ysv

YAML-driven CSV formatter

CSV (Comma Separated Values) is a very common data storage, distribution, and exchange format. It is also very simple and widely supported.

CSV lacks standardization and typing, though. Real-world datasets often have inconsistent data formats, separators, and quoting, and if you want to use them you have to spend time tediously cleaning them up. That is major boring nonsense, is it not?

And here comes ysv.

input.csv
make,model,year
Ford,Fusion,2016
Chevrolet,Equinox,2013
Toyota,Camry,2018
Infiniti,Q50,2016
Mercedes-Benz,GLE,2020

In the configuration file, you describe the output dataset you want the tool to produce:

  • column names and order

  • where to take the data for each output column

  • and what transformations to apply to the data

The resulting dataset may be saved to disk, uploaded to a remote host, or imported into a database, whatever you wish.
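To make the idea concrete, here is a minimal Python sketch of what such a configuration expresses: output column names and order, the input column each one reads from, and the transformations applied. It is only an illustration of the concept. The `SPEC` structure, the `uppercase` and `lowercase` transformation names, and the `format_csv` helper are all made up for this example and are not ysv's actual configuration schema or code.

```python
import csv
import io

# Hypothetical spec, NOT ysv's real configuration schema: each entry is one
# output column, listed in output order, naming the input column it reads
# from and the transformations to apply to each value.
SPEC = [
    {"name": "year", "source": "year", "transforms": []},
    {"name": "brand", "source": "make", "transforms": ["uppercase"]},
    {"name": "model", "source": "model", "transforms": ["lowercase"]},
]

# Hypothetical transformation names mapped to plain Python functions.
TRANSFORMS = {
    "uppercase": str.upper,
    "lowercase": str.lower,
}


def format_csv(text, spec):
    """Build the output CSV text described by `spec` from input CSV `text`."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")

    # Header row: output column names in the order given by the spec.
    writer.writerow([col["name"] for col in spec])

    for row in reader:
        cells = []
        for col in spec:
            value = row[col["source"]]
            # Apply the column's transformations in the listed order.
            for name in col["transforms"]:
                value = TRANSFORMS[name](value)
            cells.append(value)
        writer.writerow(cells)

    return out.getvalue()


# The same rows as input.csv above.
INPUT = """make,model,year
Ford,Fusion,2016
Chevrolet,Equinox,2013
Toyota,Camry,2018
Infiniti,Q50,2016
Mercedes-Benz,GLE,2020
"""

result = format_csv(INPUT, SPEC)
print(result)
```

With this made-up spec, the output starts with the header `year,brand,model`, followed by rows such as `2016,FORD,fusion`: the columns are reordered, `make` is renamed to `brand`, and each value passes through its listed transformations.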

Disclaimer

ysv is at an alpha stage of development, but it is already being used in production to good effect, which led me to believe it may be useful to someone else. Feedback is greatly appreciated.
