Data-Forge

The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.
GitHub
1.34k
Created 7 years ago, last commit a month ago
12 contributors
510 commits
Stars added on GitHub, month by month
12
1
2
3
4
5
6
7
8
9
10
11
2023
2024
Stars added on GitHub, per day, on average
Yesterday
=
Last week
0.0
/day
Last month
+0.2
/day
Last 12 months
+0.2
/day
npmPackage on NPM
Monthly downloads on NPM
12
1
2
3
4
5
6
7
8
9
10
11
2023
2024
README

Data-Forge

The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.

Implemented in TypeScript.
Used in JavaScript ES5+ or TypeScript.

To learn more about Data-Forge visit the home page.

Read about Data-Forge for data science in the book JavaScript for Data Science.

Love this? Please star this repo and click here to support my work

Build Status npm version License

Please note that this TypeScript repository replaces the previous JavaScript version of Data-Forge.

BREAKING CHANGES

As of v1.6.9 the dependencies Sugar, Lodash and Moment have been factored out (or replaced with smaller dependencies). This more than halves the bundle size. Hopefully this won't cause any problems - but please log an issue if something changes that you weren't expecting.

As of v1.3.0 file system support has been removed from the Data-Forge core API. This is after repeated issues from users trying to get Data-Forge working in the browser, especially under AngularJS 6.

Functions for reading and writing files have been moved to the separate code library Data-Forge FS.

If you are using the file read and write functions prior to 1.3.0 then your code will no longer work when you upgrade to 1.3.0. The fix is simple though, where usually you would just require in Data-Forge as follows:

const dataForge = require('data-forge');

Now you must also require in the new library as well:

const dataForge = require('data-forge');
require('data-forge-fs');

Data-Forge FS augments Data-Forge core so that you can use the readFile/writeFile functions as in previous versions and as is shown in this readme and the guide.

If you still have problems with AngularJS 6 please see this workaround: #3 (comment)

Install

To install for Node.js and the browser:

npm install --save data-forge

If working in Node.js and you want the functions to read and write data files:

npm install --save data-forge-fs

Quick start

Data-Forge can load CSV, JSON or arbitrary data sets.

Parse the data, filter it, transform it, aggregate it, sort it and much more.

Use the data however you want or export it to CSV or JSON.

Here's an example:

const dataForge = require('data-forge');
require('data-forge-fs'); // For readFile/writeFile.

dataForge.readFileSync('./input-data-file.csv') // Read CSV file (or JSON!)
    .parseCSV()
    .parseDates(["Column B"]) // Parse date columns.
    .parseInts(["Column B", "Column C"]) // Parse integer columns.
    .parseFloats(["Column D", "Column E"]) // Parse float columns.
    .dropSeries(["Column F"]) // Drop certain columns.
    .where(row => predicate(row)) // Filter rows.
    .select(row => transform(row)) // Transform the data.
    .asCSV() 
    .writeFileSync("./output-data-file.csv"); // Write to output CSV file (or JSON!)

From the browser

Data-Forge has been tested with Browserify and Webpack. Please see links to examples below.

If you aren't using Browserify or Webpack, the npm package includes a pre-packed browser distribution that you can install and included in your HTML as follows:

<script language="javascript" type="text/javascript" src="node_modules/data-forge/dist/web/index.js"></script>

This gives you the data-forge package mounted under the global variable dataForge.

Please remember that you can't use data-forge-fs or the file system functions in the browser.

Features

  • Import and export CSV and JSON data and text files (when using Data-Forge FS).
  • Or work with arbitrary JavaScript data.
  • Many options for working with your data:
    • Filtering
    • Transformation
    • Extracting subsets
    • Grouping, aggregation and summarization
    • Sorting
    • And much more
  • Great for slicing and dicing tabular data:
    • Add, remove, transform and generate named columns (series) of data.
  • Great for working with time series data.
  • Your data is indexed so you have the ability to merge and aggregate.
  • Your data is immutable! Transformations and modifications produce a new dataset.
  • Build data pipeline that are evaluated lazily.
  • Inspired by Pandas and LINQ, so it might feel familiar!

Contributions

Want a bug fixed or maybe to improve performance?

Don't see your favourite feature?

Need to add your favourite Pandas or LINQ feature?

Please contribute and help improve this library for everyone!

Fork it, make a change, submit a pull request. Want to chat? See my contact details at the end or reach out on Gitter.

Platforms

Documentation

Resources

Contact

Please reach and tell me what you are doing with Data-Forge or how you'd like to see it improved.

Support the developer

Click here to support the developer.