A lightweight command-line SQL-like query engine for CSV files.
Inspired by MySQL's interactive shell, csv-query lets you load large CSV files and run SQL-style queries with SELECT, WHERE, ORDER BY, and LIMIT, as well as STATS and EXPORT commands. It is aimed at ML engineers, data analysts, and developers who want a fast way to inspect and slice datasets directly from the terminal.
- Interactive CLI: Run multiple queries in a MySQL-style REPL.
- SQL-like query support: SELECT, WHERE, ORDER BY, LIMIT, DISTINCT
- Dataset inspection: DESC, STATS, and HELP commands
- ML-focused features: Profiling, schema view, export results
- Output formatting: Pretty tables, JSON, and raw CSV
- Modular codebase for easy extension
csv-query/
├── .gitignore
├── README.md
├── src/
│ ├── filter.py
│ ├── loader.py
│ ├── main.py
│ ├── parser.py
│ ├── tabulate.py
│ ├── utils.py
git clone https://github.com/AndyFerns/csv-query.git
cd csv-query
pip install -r requirements.txt
python src/main.py
csv-query> SELECT * FROM data.csv WHERE age > 30 ORDER BY name LIMIT 5;
csv-query> DESC;
csv-query> STATS;
csv-query> EXPORT output.csv;
csv-query> HELP;
csv-query> EXIT;
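Each statement typed at the prompt ends with a semicolon and follows a fixed clause order. The sketch below shows one way such a line could be split into its clauses with a regular expression. It is an illustration only: `QUERY_RE` and `parse_select` are hypothetical names, and the project's own `parser.py` may work quite differently.

```python
import re

# Hypothetical grammar for illustration; not taken from the project's parser.py.
QUERY_RE = re.compile(
    r"""^\s*SELECT\s+(?P<distinct>DISTINCT\s+)?(?P<columns>.+?)
        \s+FROM\s+(?P<source>\S+?)
        (?:\s+WHERE\s+(?P<where>.+?))?
        (?:\s+ORDER\s+BY\s+(?P<order_col>\w+)(?:\s+(?P<direction>ASC|DESC))?)?
        (?:\s+LIMIT\s+(?P<limit>\d+))?
        \s*;?\s*$""",
    re.IGNORECASE | re.VERBOSE,
)


def parse_select(statement: str) -> dict:
    """Split a single-line SELECT statement into its clauses."""
    match = QUERY_RE.match(statement)
    if match is None:
        raise ValueError(f"not a recognised SELECT statement: {statement!r}")
    return {
        "columns": [c.strip() for c in match["columns"].split(",")],
        "distinct": match["distinct"] is not None,
        "source": match["source"],
        "where": match["where"],  # left as raw text; evaluating it is a separate step
        "order_by": match["order_col"],
        "descending": (match["direction"] or "ASC").upper() == "DESC",
        "limit": int(match["limit"]) if match["limit"] else None,
    }


query = parse_select(
    "SELECT * FROM data.csv WHERE age > 30 ORDER BY name LIMIT 5;"
)
print(query)
```

A regular expression keeps the sketch short, but it will not cope with quoted strings that contain keywords; a tokenizer would be the sturdier choice for anything beyond simple statements.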
SQL-like query to filter and view data.
SELECT column1, column2 FROM file.csv WHERE condition ORDER BY column DESC LIMIT 10;
WHERE supports comparison and logical operators:
- Comparison: =, !=, <, >, <=, >=
- Logical: AND, OR, NOT
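As a rough illustration of how the operators above could be evaluated against a CSV row, the sketch below maps each comparison symbol to a function from Python's standard `operator` module. The names `COMPARATORS`, `coerce`, and `compare` are hypothetical, and the numeric coercion rule is an assumption rather than the behavior of the project's `filter.py`.

```python
import operator

# Comparison symbols mapped to standard-library functions.
COMPARATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}


def coerce(value: str):
    """Assumption: numeric-looking strings compare as numbers, not text."""
    try:
        return float(value)
    except ValueError:
        return value


def compare(row: dict, column: str, op: str, literal: str) -> bool:
    """Evaluate `column op literal` for one row read from a CSV file."""
    left, right = coerce(row[column]), coerce(literal)
    if type(left) is not type(right):
        # Mixed number/text: fall back to plain string comparison.
        left, right = str(row[column]), str(literal)
    return COMPARATORS[op](left, right)


row = {"name": "Ada", "age": "36", "dept": "eng"}

# age > 30 AND NOT dept = sales, expressed with Python's own and/not.
matches = compare(row, "age", ">", "30") and not compare(row, "dept", "=", "sales")
print(matches)
```

CSV values arrive as strings, so without the coercion step a condition such as `age < 30` would compare `"9"` and `"30"` as text and give the wrong answer.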
ORDER BY sorts ascending by default; add DESC for descending order.
LIMIT restricts the number of output rows.
SELECT name, salary FROM employees.csv WHERE salary > 50000 ORDER BY salary DESC LIMIT 10;
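To make the clause order concrete, here is a minimal sketch of what the example query does, written with only the standard `csv` module. The function `run_query` and the sample rows are invented for illustration, and the sort assumes the ORDER BY column is numeric; none of this is taken from the project's source.

```python
import csv
import io


def run_query(rows, columns, predicate=None, order_by=None,
              descending=False, limit=None):
    """Apply WHERE, then ORDER BY, then LIMIT, then project the columns."""
    selected = [r for r in rows if predicate is None or predicate(r)]
    if order_by is not None:
        # Assumption: the sort column holds numbers.
        selected.sort(key=lambda r: float(r[order_by]), reverse=descending)
    if limit is not None:
        selected = selected[:limit]
    return [{c: r[c] for c in columns} for r in selected]


# Made-up stand-in for employees.csv.
SAMPLE = "name,salary,dept\nAda,72000,eng\nBob,48000,ops\nCy,91000,eng\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE)))

result = run_query(
    rows,
    columns=["name", "salary"],
    predicate=lambda r: float(r["salary"]) > 50000,
    order_by="salary",
    descending=True,
    limit=10,
)
for row in result:
    print(row)
```

Projection happens last so that a query can sort or filter on a column it does not display.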
Output defaults to a pretty table. Configurable output (future support) is planned for these formats:
- Table (default)
- JSON
- CSV string
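The three formats could each be produced from the same list of row dictionaries. The sketch below uses only the standard library; `to_table`, `to_json`, and `to_csv` are hypothetical helpers, not the functions in the project's `tabulate.py`, and the table layout is an assumption.

```python
import csv
import io
import json


def to_json(rows):
    """Serialize rows as a JSON array of objects."""
    return json.dumps(rows, indent=2)


def to_csv(rows):
    """Serialize rows as a CSV string with a header line."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_table(rows):
    """Render rows as a plain-text table with padded columns."""
    if not rows:
        return "(no rows)"
    headers = list(rows[0])
    widths = [max(len(h), *(len(str(r[h])) for r in rows)) for h in headers]

    def line(values):
        return " | ".join(str(v).ljust(w) for v, w in zip(values, widths))

    separator = "-+-".join("-" * w for w in widths)
    body = [line([r[h] for h in headers]) for r in rows]
    return "\n".join([line(headers), separator] + body)


rows = [{"name": "Ada", "age": "36"}, {"name": "Bob", "age": "41"}]
print(to_table(rows))
print(to_json(rows))
print(to_csv(rows))
```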
- 📌 STATS command for quick profiling
- 🧹 DISTINCT to check cardinality and deduplication
- 🧮 Export filtered datasets for preprocessing
- 🧭 Interactive CLI for ad-hoc data exploration
- 🗂 DESC + HELP for quick schema and command reference
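This README does not spell out what STATS reports, so the following is only a guess at the kind of per-column profile it might produce, alongside a DISTINCT-style helper. The functions `column_stats` and `distinct`, the choice of metrics, and the rule that empty strings count as missing are all assumptions.

```python
import statistics


def column_stats(rows, column):
    """Profile one column: counts always, min/max/mean when it is numeric."""
    values = [r[column] for r in rows if r[column] != ""]
    summary = {
        "count": len(values),
        "missing": len(rows) - len(values),  # assumption: "" means missing
        "distinct": len(set(values)),
    }
    try:
        numbers = [float(v) for v in values]
    except ValueError:
        return summary  # non-numeric column: counts only
    if numbers:
        summary.update(
            min=min(numbers),
            max=max(numbers),
            mean=statistics.fmean(numbers),
        )
    return summary


def distinct(rows, column):
    """Return unique values in first-seen order."""
    return list(dict.fromkeys(r[column] for r in rows))


# Made-up rows for illustration.
rows = [
    {"dept": "eng", "age": "30"},
    {"dept": "ops", "age": "40"},
    {"dept": "eng", "age": ""},
]
print(column_stats(rows, "age"))
print(distinct(rows, "dept"))
```

Comparing the `distinct` count with the row count is a quick way to spot identifier columns (all unique) and low-cardinality categorical columns before feeding a dataset into a model.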
Pull requests are welcome! Please document your code and follow the folder structure.
MIT License. Feel free to fork, extend, and modify.
Made by AndyFerns