During this coronavirus lockdown, IGIS has set out to revamp its data infrastructure to address our growing needs for big data storage and management. In particular, over the past few years we have accumulated over a dozen terabytes of drone data and associated mapping products, constituting tens of thousands of project files, and the quantity of this data is expected to keep growing. Typically this drone data has been processed on a number of local desktop computers and then backed up onto RAID hard drives or the cloud for cold storage; however, this is far from ideal in terms of consistent organization, versioning and ease of distribution.
As a solution to the problem, IGIS purchased a web server equipped with multiple virtual machines (for processing, analysis and web services), along with a 30TB RAID data store/repository. The repository was networked to our various IGIS computers and RAID storage devices so that all of our drone data could be transferred over to it. After much consideration, we settled on a standardized file structure that can accommodate datasets from both past and future drone projects, with room for growth as needed. A Python script was written to automatically generate this file structure from a few metadata inputs for each project. Our previous projects' data were then moved into their appropriate slots in the new structure, and unwanted intermediary processing files were discarded along the way, freeing up a large amount of storage space. As you might expect, moving all of this data was quite time consuming. Moving forward, however, it will be easy to set up each project's file structure automatically, right from the project's inception. That starts in the field, by running the Python script in ArcGIS Pro's Jupyter Notebook utility, and ends with the data delivered to the server repository in a nicely organized package (similar to what we would provide to our non-IGIS project collaborators).
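To make the idea concrete, here is a minimal sketch of what such a structure-generating script could look like. It is not the IGIS script itself: the subfolder names, the folder naming pattern, the metadata fields, and the function names `project_folder_name` and `create_project` are all assumptions made for illustration, since the post does not describe the actual layout.

```python
import json
from datetime import date
from pathlib import Path

# Hypothetical layout: the actual IGIS folder names are not given in the post.
SUBFOLDERS = [
    "01_raw_imagery",
    "02_flight_logs",
    "03_ground_control",
    "04_processing",
    "05_products/orthomosaic",
    "05_products/dsm",
    "06_docs",
]


def _clean(text: str) -> str:
    """Lowercase the text and replace anything non-alphanumeric with '_'."""
    return "".join(c if c.isalnum() else "_" for c in text.strip()).lower()


def project_folder_name(site: str, flight_date: date, sensor: str) -> str:
    """Build a sortable, space-free folder name from project metadata."""
    return f"{flight_date:%Y%m%d}_{_clean(site)}_{_clean(sensor)}"


def create_project(root, site, flight_date, sensor, pilot=""):
    """Create the standard folder tree for one project and save its metadata."""
    project_dir = Path(root) / project_folder_name(site, flight_date, sensor)
    for sub in SUBFOLDERS:
        # exist_ok=True makes the script safe to re-run on an existing project.
        (project_dir / sub).mkdir(parents=True, exist_ok=True)

    # Assumed metadata fields, stored next to the data for later scripting.
    metadata = {
        "site": site,
        "flight_date": flight_date.isoformat(),
        "sensor": sensor,
        "pilot": pilot,
    }
    (project_dir / "metadata.json").write_text(json.dumps(metadata, indent=2))
    return project_dir
```

Calling `create_project("D:/drone_projects", "Example Site", date(2020, 4, 15), "RGB")` from a notebook cell would create a folder named `20200415_example_site_rgb` with the subfolders above (the drive path here is only a placeholder). Leading the folder name with the date keeps projects in chronological order in any file browser.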
That alone is a big step in the right direction, but it gets better. Because all of this data is now in a standardized file structure with standardized folder naming conventions, scripting our ArcGIS portal to automatically connect with the data via the image server was only a small step away. With this complete, any IGIS team member can now access our entire inventory of post-processed, GIS-ready drone data layers via ArcGIS Online or ArcGIS Pro.
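The reason a standardized structure makes this kind of automation easy is that a script can find every finished product without any per-project special cases. The sketch below illustrates only that discovery step, using the Python standard library. It does not call any ArcGIS API, and the publishing step itself is not shown. The `05_products` folder name, the raster file extensions, and the function name `inventory_products` are assumptions for illustration, not details from the IGIS setup.

```python
from pathlib import Path

# Assumed raster formats for finished products (e.g. GeoTIFF orthomosaics).
RASTER_SUFFIXES = {".tif", ".tiff"}


def inventory_products(repo_root, products_dirname="05_products"):
    """List every raster stored under <project>/<products_dirname>/.

    Relies on each project keeping its finished products in a folder with
    the same (hypothetical) name, which is what a standard layout provides.
    """
    records = []
    for project_dir in sorted(Path(repo_root).iterdir()):
        products = project_dir / products_dirname
        if not products.is_dir():
            continue  # stray file, or a project without finished products
        for raster in sorted(products.rglob("*")):
            if raster.is_file() and raster.suffix.lower() in RASTER_SUFFIXES:
                records.append({
                    "project": project_dir.name,
                    # The immediate parent folder doubles as the product type.
                    "product_type": raster.parent.name,
                    "path": str(raster),
                })
    return records
```

Each record carries the project name, the product type and the file path, which is the information a publishing script would need to register a layer under a predictable name.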
Ultimately, this has been a big leap forward for IGIS's informatics infrastructure, and it complements our significantly evolved pipeline for drone data collection and processing, depicted below.
- Author: Andy Lyons
A Unique Data Science Summit
Yesterday, several of us in the IGIS Program participated remotely in a very interesting summit on data science in agriculture. The summit was sponsored by the National Institute of Food and Agriculture (NIFA), which is the funding arm of the US Department of Agriculture (USDA). The goal of the summit was to hear examples of how data collection systems and analytics are playing a transformative role in agriculture, in order to help USDA develop an investment strategy for the next phase of its data science grant program. USDA has been funding innovative big data projects for some time, and will soon be rolling out a new initiative called FACT (Food and Agriculture Cyberinformatics and Tools Initiative).
It was exciting to hear the presentations about how rapid advancements in data collection systems, processing, and analytics are changing agriculture across the US and overseas. From sensor systems that support precision farming, to a new generation of genomics studies, to smarter production models and decision support systems, innovation is happening everywhere. The recorded presentations are online.
What Should USDA Fund?
NIFA is actively soliciting input from experts in the field about funding priorities, and has set up an online forum where people can provide feedback and vote for ideas. The forum is centered around six questions that were also discussed in breakout groups at yesterday's summit. The questions ask what the most promising opportunities are for:
- data-driven advances in agriculture and the food-production systems?
- enhancing cross-sector advances in data applications?
- data-driven advances to address societal well-being and consumer demands?
- addressing challenges in the various facets of data management and application?
- ensuring future generations of data expertise?
- big data in communications, property rights, and communities?
Data Science in ANR
ANR Farm Advisors and Specialists have been exploring similar questions for years. To name just a couple of examples, the Precision Agriculture workgroup has been developing methods to measure and manage for in-field variability. ANR has also sponsored several apps-for-ag hackathons, including one hosted this past summer in collaboration with the State Fair. Here at IGIS, we teach workshops on geospatial data analysis, data management, and remote sensing with drones. We also maintain ANR's network of flux towers, and have digitized historical records from ANR's network of Research and Extension Centers.
What do YOU think?
Many people think data analytics will be the engine for the next revolution in agriculture. What do you think the priority areas should be? NIFA is soliciting input via its Ideas Engine through the end of October. Take this unique opportunity to help shape the future of agricultural data science by letting your voice be heard!