UC ANR was a sponsor of the FOODIT: Fork to Farm meeting in June 2017: http://mixingbowlhub.com/events/food-fork-farm/. Many of us were there to learn what is happening in the food-data-tech space and how UC ANR can be of service. It was pretty cool. First, it was held in the Computer History Museum, which is rad. Second, the idea of the day was to link partners, industry, scientists, funders, and foodies around sustainable food production, distribution, and delivery. Third, there were some rad snacks (pic below).
We had an initial talk from Michiel Bakker of Google Food, whose team has broadened its thinking about food to include not just feeding Googlers, but the overall food chain and food-system sustainability. They have developed five "foodshots" (i.e., "moonshot"-style thinking): 1) enable individuals to make better choices, 2) shift diets, 3) increase food-system transparency, 4) reduce food losses, and 5) build a closed, circular food system.
We then had a series of moderated panels.
The Dean's List introduced a panel of university deans, moderated by our very own Glenda Humiston @UCANR, and included Helene Dillard (UC Davis), Andy Thulin (Cal Poly), and Wendy Wintersteen (Iowa State). Key discussion points included lack of food-system transparency, science communication and literacy, making money with organics, education and training, farm sustainability and efficiency, market segmentation (e.g. organics), downstream processing, and consumer power to change food systems. Plus the Amazon purchase of Whole Foods.
The Tech-Enabled Consumer session featured four speakers from companies that build tech around food: Katie Finnegan from Walmart, David McIntyre from Airbnb, Barbara Shpizner from Mattson, and Michael Wolf from The Spoon. Pretty neat discussion around the ways these diverse companies use tech to customize the customer experience, provide cost savings, source food, and contribute to a better food system. Roughly 40% of food waste happens in homes, and another 40% in the consumer-facing arena. So much to be done!
The session on Downstream Impacts for the Food Production System featured Chris Cochran from ReFED @refed_nowaste, Sabrina Mutukisna from The Town Kitchen @TheTownKitchen, Kevin Sanchez from the Yolo Food Bank @YoloFoodBank, and Justin Siegel from UC Davis International Innovation and Health. We talked about nutrition for all, schemes for minimizing food waste, waste streams, food banks, and distributing produce and protein to those who need them (@refed_nowaste and @YoloFoodBank); creating high-quality jobs for young people of color in the food business (@TheTownKitchen); and the amount of energy that goes into the food system. David Lee from ARPA-E noted that 7% of US energy use goes to CREATING FOOD WASTE. Yikes!
The session on Upstream Production Impacts from New Consumer Food Choices featured Ally DeArman from Food Craft Institute @FoodCraftInst, Mike Macrie from Land O'Lakes, Nolan Paul from Driscoll's @driscollsberry, and Kenneth Zuckerberg from Rabobank @Rabobank. This session got cut a bit short, but it was pretty interesting. Especially the Food Craft Institute, whose mission is to help "the small guys" succeed in the food space.
The afternoon sessions included some pitch competitions, deep dive breakouts and networking sessions. What a great day for ANR.
So much to learn! Here is my distillation of the main take-homes from last week.
Notes about the workshop in general:
- Making participants do organized homework and install software in advance is necessary for complicated workshop content: http://neondataskills.org/workshop-event/NEON-Data-Insitute-2017
- NEON used tips from Software Carpentry workshops, particularly the green-and-pink sticky-note trick: a pink sticky on your laptop means you need help; a green sticky means you are all good. This makes everything go smoother and means participants don't have to hold up a hand to get attention.
- Having lots of helpful, friendly faces around during the coding bits, and access to the code when you fell behind, was critical.
- The workshop content and daily schedule:
NEON data and resources:
- http://www.neonscience.org/resources/data-tutorials
- All the NEON airborne data can be found here: http://www.neonscience.org/data/airborne-data
- For more on when data rolls out, sign up for the NEON eNews here: http://www.neonscience.org/
Other misc. tools:
- For cleaning messy data - check out OpenRefine - a FOS tool for cleaning messy data http://openrefine.org/
- Excel is cray-cray, best practices for spreadsheets: http://www.datacarpentry.org/spreadsheet-ecology-lesson/
- Morpho (from DataOne) to enter metadata: https://www.dataone.org/software-tools/morpho
- Pay attention to file size with your git repositories - check out: https://git-lfs.github.com/. Git is good for things you do with your hands (like code), not for large data.
- Markdown renderer: http://dillinger.io/
- MIT License, like Creative Commons for code: https://opensource.org/licenses/MIT
- There is a new project called "Feather" that allows compatibility between python and R: https://blog.rstudio.org/2016/03/29/feather/
- Information on the TIFF specification and TIFF tags here: http://awaresystems.be/, however their TIFF Tag Viewer is Windows-only.
- All NEON point cloud classifications are done with LAStools. Go LAStools! https://rapidlasso.com/lastools/
- Check out PDAL - like GDAL, but for point clouds. It can be used from bash. Learned from my workshop neighbor Sergio Marconi https://www.pdal.io/
- Reflectance Tarps are made by GroupVIII http://www.group8tech.com/
- ATCOR http://www.rese.ch/products/atcor/ says we should be able to rely on 3-5% error on reflectance when atmospheric correction is done correctly (say that 10 times fast) with a well-calibrated instrument.
- NEON hyperspectral data is stored in HDF5 format. HDFView is a great tool for interrogating the metadata, among other things. https://support.hdfgroup.org/products/java/hdfview/
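HDFView is point-and-click, but the same interrogation can be scripted. Here is a minimal sketch using the `h5py` library; the file name, dataset name, and attribute here are made up for illustration and are not a real NEON product structure (with real NEON data you would simply open the downloaded .h5 file):

```python
import h5py
import numpy as np

# Build a tiny stand-in HDF5 file so the sketch is self-contained.
with h5py.File("demo.h5", "w") as f:
    d = f.create_dataset("reflectance", data=np.zeros((2, 3), dtype="f4"))
    d.attrs["wavelength_units"] = "nanometers"   # hypothetical attribute

# Interrogate the file the way HDFView would.
with h5py.File("demo.h5", "r") as f:
    f.visit(print)                 # list every group/dataset name
    ds = f["reflectance"]
    print(ds.shape)                # dimensions of the dataset
    print(dict(ds.attrs))          # its metadata attributes
```

Being able to walk the HDF5 tree in code is what makes the Jupyter hyperspectral exercises work.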
First of all, Pearl Street Mall is just as lovely as I remember, but OMG it is so crowded, with so many new stores and chains. Still, good food, good views, hot weather, lovely walk.
Welcome to Day 2! http://neondataskills.org/data-institute-17/day2/
Our morning session focused on reproducibility and workflows with the great Naupaka Zimmerman. Remember the characteristics of reproducibility: organization, automation, documentation, and dissemination. We focused on organization, and spent an enjoyable hour sorting through an example messy directory of miscellaneous data files and code. The directory looked a bit like many of my directories. Lesson learned. We then moved to working with new data and git to reinforce yesterday's lessons. Git was super confusing to me 2 weeks ago, but now I think I love it. We also went back and forth between Jupyter notebooks and standalone Python scripts, and abstracted our variables, and lo and behold I got my script to run.
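"Abstracting your variables" can be as simple as pulling hard-coded values to the top of a script so there is one place to change inputs. A minimal sketch (the directory layout and file names here are hypothetical, not the workshop's actual files):

```python
import os

# All the knobs live up here, not buried in the analysis code below.
DATA_DIR = "data/raw"     # hypothetical project layout
SITE = "SERC"             # NEON site code to process
chm_file = os.path.join(DATA_DIR, f"{SITE}_CHM.tif")

def describe(path):
    """Return a short human-readable label for a data file (no I/O needed)."""
    return f"{os.path.basename(path)} in {os.path.dirname(path)}"

print(describe(chm_file))  # -> SERC_CHM.tif in data/raw
```

Re-running the analysis for a different site is then a one-line change to `SITE` rather than a hunt through the script.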
The afternoon focused on lidar (yay!), and prior to coding we talked about discrete and waveform data and collection, and the OpenTopography project (http://www.opentopography.org/) with Benjamin Gross. The OpenTopography talk was really interesting. They are not just a data distributor anymore; they also provide an HPC framework (mostly TauDEM for now) on their servers at SDSC (http://www.sdsc.edu/). They are going to roll out user-initiated HPC functionality soon, so stay tuned for their new "pluggable assets" program. This is well worth checking into. We also spent some time live coding in Python with Bridget Hass, working with a CHM from the SERC site in Maryland, and had a nerve-wracking code challenge to wrap up the day.
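The CHM exercise boils down to simple raster arithmetic: canopy height = surface model minus terrain model. A toy numpy sketch, with small made-up arrays standing in for the GeoTIFF rasters we loaded in the notebook:

```python
import numpy as np

# Toy 3x3 rasters standing in for NEON lidar products (elevations in meters).
dsm = np.array([[310.0, 312.5, 315.0],
                [311.0, 318.0, 316.5],
                [309.5, 314.0, 320.0]])   # digital surface model (top of canopy)
dtm = np.array([[309.0, 310.0, 311.0],
                [310.0, 311.5, 312.0],
                [309.5, 312.0, 313.5]])   # digital terrain model (bare earth)

chm = dsm - dtm          # canopy height model: vegetation height above ground
chm[chm < 0] = 0         # clamp small negative noise to zero

print(chm.max())         # tallest canopy in this toy scene: 6.5
```

The real exercise adds the georeferencing, no-data masking, and plotting, but the core is this one subtraction.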
Fun additional take-home messages/resources:
- ISO International standard for dates = YYYY-MM-DD
- Missing values: NA in R; NEON data products use -9999 as the no-data value (convert these to NaN when working in Python)
- Funny how many food metaphors are used in tech teaching: APIs as a menu in a restaurant; git add vs git commit as a grocery cart before and after purchase; finding GIS data is sometimes like shopping for ingredients in a specialty grocery store (that one is mine)...
- "Jupyter" means it runs with Julia, Python & R, who knew?
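The ISO date standard and the -9999 no-data flag noted above combine naturally when reading tabular data. A small pandas sketch (the field data here are invented and written inline so it runs anywhere):

```python
import io
import pandas as pd

# Hypothetical field table: ISO YYYY-MM-DD dates and -9999 as no-data.
csv = io.StringIO(
    "date,site,chm_m\n"
    "2017-06-20,SERC,12.4\n"
    "2017-06-21,SERC,-9999\n"
)
df = pd.read_csv(csv, parse_dates=["date"], na_values=[-9999])

print(df["date"].dtype)          # parsed as real dates, not strings
print(df["chm_m"].isna().sum())  # the -9999 became a proper missing value: 1
```

Declaring the no-data value at read time keeps sentinel values from silently leaking into means and plots.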
Thanks for everyone today! Megan Jones (our fearless leader), Naupaka Zimmerman (Reproducibility), Tristan Goulden (Discrete Lidar), Keith Krause (Waveform Lidar), Benjamin Gross (OpenTopography), Bridget Hass (coding lidar products).
Our home for the week
I left Boulder 20 years ago on a wing and a prayer with a PhD in hand, overwhelmed with bittersweet emotions. I was sad to leave such a beautiful city, nervous about what was to come, but excited to start something new in North Carolina. My future was uncertain, and as I took off from DIA that final time I basically had Tom Petty's Free Fallin' and Learning to Fly on repeat on my Walkman. Now I am back, and summer in Boulder is just as breathtaking as I remember it: clear blue skies, the stunning Flatirons making a play at outshining the snow-dusted Rockies behind them, and crisp fragrant mountain breezes acting as my madeleine. I'm back to visit the National Ecological Observatory Network (NEON) headquarters and attend their 2017 Data Institute, and re-invest in my skillset for open reproducible workflows in remote sensing.
Day 1 Wrap Up from the NEON Data Institute 2017
What a day! http://neondataskills.org/data-institute-17/day1/
Attendees (about 30) included graduate students, old dogs (new tricks!) like me, and research scientists interested in building reproducible workflows into their work. We are a mix of ages and genders. The morning session focused on learning about the NEON program (http://www.neonscience.org/): its purpose, sites, sensors, data, and protocols. NEON, funded by NSF and managed by Battelle, was conceived in 2004 and will go online in Jan 2018 for a 30-year mission providing free and open data on the drivers of, and responses to, ecological change. NEON data come from IS (instrumented systems), OS (observation systems), and RS (remote sensing). We focused on the Airborne Observation Platform (AOP), which uses 2 (soon to be 3) aircraft, each with a payload of a hyperspectral sensor (from JPL; 426 bands, each 5 nm wide, spanning 380-2510 nm; 1 mrad IFOV; 1 m resolution at 1000 m AGL), lidar sensors (Optech, soon to be Riegl; discrete and waveform), and an RGB camera (PhaseOne D8900). These sensors produce co-registered raw data, which are processed at NEON headquarters into various levels of data products. Flights are planned to cover each NEON site once, timed to capture 90% or higher peak greenness, which is pretty complicated when distance and weather are taken into account. Pilots and techs are on the road and in the air from March through October collecting these data.
In the afternoon session, we took a fairly immersive dunk into Jupyter notebooks for exploring hyperspectral imagery in HDF5 format. We did exploration, band stacking, widgets, and vegetation indices. We closed with a fast discussion about TGF (The Git Flow): the way to store, share, and version-control your data and code to ensure reproducibility. We forked, cloned, committed, pushed, and pulled. Not much more to write about, but the whole day was awesome!
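The cycle we practiced can be rehearsed end-to-end in a throwaway local repository. This sketch uses made-up file and repo names, and omits `push`/`pull` since those need a real GitHub remote (in the workshop we forked and cloned the course repo first):

```shell
# A disposable local repo to practice the add/commit cycle.
cd "$(mktemp -d)"
git init -q my-analysis
cd my-analysis
git config user.name "Demo"
git config user.email "demo@example.com"

echo "# NEON notes" > notes.md
git add notes.md                 # stage: the grocery cart
git commit -q -m "Add notes"     # purchase: a permanent snapshot
git log --oneline                # history now shows the single commit
```

With a fork on GitHub, `git push origin <branch>` shares your commits and `git pull upstream master` syncs you with the original repo.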
Fun additional take-home messages:
- NEON is amazing. I should build some class labs around NEON data, and NEON classroom training materials are available: http://www.neonscience.org/resources/data-tutorials
- Making participants do organized homework is necessary for complicated workshop content: http://neondataskills.org/workshop-event/NEON-Data-Insitute-2017
- HDF5 as a possible alternative data format for lidar, holding both discrete and waveform data
- NEON imagery data is FedExed daily to headquarters after collection
- I am a crap python coder
- #whofallsbehindstaysbehind
- Tabs are my friend
Thanks to everyone today, including: Megan Jones (Main leader), Nathan Leisso (AOP), Bill Gallery (RGB camera), Ted Haberman (HDF5 format), David Hulslander (AOP), Claire Lunch (Data), Cove Sturtevant (Towers), Tristan Goulden (Hyperspectral), Bridget Hass (HDF5), Paul Gader, Naupaka Zimmerman (GitHub flow).
- Author: Andy Lyons
- Author: Sean Hogan
Last week we enjoyed attending the CalGIS 2017 conference in Oakland. This year the meeting was co-hosted with LocationCon, so it was larger than usual and had a good mix of participants from government, non-profits, academia, and consulting companies. As expected there were a lot of people from California, but we also met a lot of people from other parts of the US.
The first day of the conference was devoted to workshops, and IGIS gave a half-day version of our workshop on drone technology and data analysis. This was well attended, and one of several sessions focused on drone technology. For us, this was also more preparation for our upcoming Dronecamp at the end of July.
The following day, Andy gave a presentation on some of the issues for scaling up drone capacity within a specific institutional setting like ANR. We discussed some of the issues we've been dealing with, including matching the scale of the data to the scale of science and management questions, outreach and training, regulatory compliance, and tailoring off-the-shelf technology for specific applications and contexts. The Q&A period highlighted a number of common challenges facing many organizations striving to take advantage of drone technology. One of the most common needs is developing institutional level policies to ensure safety and compliance with a dynamic array of federal, state, and local regulations. This discussion reminded us how fortunate ANR is to be backed by the UC Center of Excellence on Unmanned Aircraft System Safety, because many local agencies and public utilities are still trying to figure it out.
Another common theme that came up was management of the massive amount of data that drones can collect, and how to share and find drone data. Managing drone data is challenging because of the sheer volume of data. This makes many traditional strategies and platforms unworkable, and even cloud-based solutions difficult to use because of long transfer times. Like many programs, we started managing drone data by adapting existing tools and established practices from other fields like GIS and remote sensing, which we have been refining as we learn more and as our drone service program grows to include more people and projects. We started documenting our data management system in a recent Tech Note (more about that in an upcoming blog), and are currently exploring a new online platform for dissemination in collaboration with ESRI (stay tuned for more info about that also). What became clear at the conference, however, is that the tools and platforms for drone data management are still catching up, and we have a long way to go before we can reach the capabilities of portals for more traditional GIS data, such as the State of CA Geoportal or even the National Map.
Other highlights from the conference were the many excellent talks, including presentations on using drones to create a very precise digital elevation model of a wetland restoration site, techniques for machine learning classification of aerial imagery, and the US Forest Service's system-wide database called EDW. We also heard about some of the exciting new features of Cal-Adapt, including an API that will dramatically simplify the process of creating decision support tools and other applications that require downscaled climate forecasts. Many of the presentations are available through the conference website, all of which are well worth checking out.