UCL for Code in Research

4/9 Research Software Engineering with Python (COMP233) - Data Formats

In this episode I discuss data formats such as CSV, JSON and YAML. My guest is Nick Radcliffe of Stochastic Solutions and the University of Edinburgh. Nick's expertise is in data science, and he has a lot to share about data, data formats and how to use them.

Links

  • https://www.linkedin.com/in/njradcliffe/ Nick's LinkedIn profile
  • https://en.wikipedia.org/wiki/Comma-separated_values CSV formats
  • https://www.json.org/json-en.html JSON
    • https://json-ld.org JSON for linked data
    • https://json-schema.org JSON schema
  • https://yaml.org YAML
  • https://parquet.apache.org Parquet by Apache
  • https://hdfgroup.github.io/hdf5/ HDF5

Libraries

  • https://numpy.org
  • https://scipy.org
  • https://scikit-learn.org/stable/
  • http://www.tdda.info Test-driven data analysis (TDDA)

Don't be shy - say Hi

This podcast is brought to you by the Advanced Research Computing Centre at University College London, UK.
Producer and Host: Peter Schmidt