References
Project-Related
Joyce, J., B. Adams, J. Doyle, K. Fillingham, M. Iannucci, A. Kerney, K. Knee, D. Moretti, J. Quintrell, D. Snowden, T. C. Vance, and M. Wengren. (2024). A Workflow for Serving Model Data in the Cloud to a Broader Community. 40th Conference on Environmental Information Processing Technologies (EIPT), Baltimore, MD, American Meteorological Society, 28 Jan–2 Feb, 2024. https://ams.confex.com/ams/104ANNUAL/40eipt/papers/viewonly.cgi?password=106568&username=438646
Joyce, J., Knee, K., Moretti, D., Morse, C., Quintrell, J., Snowden, D., Tripp, P., and T. Vance. (2023). Architecting a Cloud-Native Service-Based Ecosystem for DMAC. 39th Conference on Environmental Information Processing Technologies (EIPT), Denver, CO, American Meteorological Society, 9-12 Jan, 2023. https://ams.confex.com/ams/103ANNUAL/meetingapp.cgi/Paper/409023
Iannucci, M. and J. Joyce. (2023, October 11). Improving Access to NOAA National Ocean Service Model Data with Kerchunk and XPublish. Pangeo Showcase. https://discourse.pangeo.io/t/pangeo-showcase-improving-access-to-noaa-national-ocean-service-model-data-with-kerchunk-and-xpublish/3725
Pangeo
Abernathey, Ryan. (2020). Big Arrays, Fast: Profiling Cloud Storage Read Throughput. Pangeo Gallery. http://gallery.pangeo.io/repos/earthcube2020/ec20_abernathey_etal/cloud_storage.html
Gowan, T., Horel, J., Jacques, A., and Kovac, A. (2022, April 8). Using Cloud Computing to Analyze Model Output Archived in Zarr Format. Journal of Atmospheric and Oceanic Technology. https://journals.ametsoc.org/view/journals/atot/39/4/JTECH-D-21-0106.1.xml
Hamman, Joe. (2020, March 9). Publishing Xarray Datasets via a Zarr compatible REST API. Medium. https://medium.com/pangeo/xpublish-ff788f900bbf
Stern, C., Abernathey, R., Hamman, J., Wegener, R., Lepore, C., Harkins, S., and A. Merose. (2022, February 10). Pangeo Forge: Crowdsourcing Analysis-Ready, Cloud Optimized Data Production. Frontiers in Climate. https://www.frontiersin.org/articles/10.3389/fclim.2021.782909/full
Stuebe, David. (2024, March 6). Optimizations for Kerchunk aggregation and Zarr I/O at scale for Machine Learning. Pangeo Showcase. https://discourse.pangeo.io/t/pangeo-showcase-optimizations-for-kerchunk-aggregation-and-zarr-i-o-at-scale-for-machine-learning/4074
Data Strategy
Baker, Tristan. (2021, February 17). Intuit’s Data Mesh Strategy. Intuit Engineering. https://medium.com/intuit-engineering/intuits-data-mesh-strategy-778e3edaa017
Broda, Eric. (2022, August 19). The Anatomy of a Data Product. Towards Data Science. https://towardsdatascience.com/the-anatomy-of-a-data-product-d3140f068311
Dehghani, Zhamak. (2019, May 20). How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh. https://martinfowler.com/articles/data-monolith-to-mesh.html
Marbou, Khalid and Kalinowski, M. (2022, May 18). Five Emerging Trends in Enterprise Data Management. Datanami. https://www.datanami.com/2022/05/18/five-emerging-trends-in-enterprise-data-management
Moses, Barr. (2022, May 24). The Rise and Fall of Data Governance (Again). Datanami. https://www.datanami.com/2022/05/24/the-rise-and-fall-of-data-governance-again/
Strengholt, Piethein. (2021, November 23). Data Domains and Data Products: Data mesh in practice. Towards Data Science. https://towardsdatascience.com/data-domains-and-data-products-64cc9d28283e
Software Architecture
Kleppmann, Martin. (2017). Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. O’Reilly.
Strengholt, Piethein. (2021, December 15). Data Domains — Where do I start?: Strategic path towards designing an enterprise-scale data mesh. Towards Data Science. https://towardsdatascience.com/data-domains-where-do-i-start-a6d52fef95d1
Verma, Kislay. (2020, July 20). How to build a technology platform. https://kislayverma.com/technology/how-to-build-a-technology-platform/