sqlitepipe
Repository: https://gitlab.com/xicalango/sqlitepipe
Crate: https://crates.io/crates/sqlitepipe
sqlitepipe
For some reason I often find myself in a situation where I wanted to store the output of command invocations into a sqlite database for later analysis.
For example "for all files in this directory, store the output of ffprobe -show_format" or "for all files, store the output of magick -format %c histogram:info:".
So far I always ended up writing "the same" python script for this, but I always thought it'd be so much easier if I could just pipe the output of a command into a sqlite database, and sqlitepipe was born.
The most basic way to invoke sqlitepipe is by just piping data into it:
ffprobe -show_format image.png | sqlitepipe
This creates (or appends) to a database called stdin.data.sqlite:
sqlite3 -table stdin.data.sqlite3 'select * from data;'+-------------------------------------+
| blob |
+-------------------------------------+
| [FORMAT] |
| filename=image.png |
| nb_streams=1 |
| nb_programs=0 |
| nb_stream_groups=0 |
| format_name=png_pipe |
| format_long_name=piped png sequence |
| start_time=N/A |
| duration=N/A |
| size=2662 |
| bit_rate=N/A |
| probe_score=99 |
| [/FORMAT] |
+-------------------------------------+
Invocating again will simply append the data:
ffprobe -show_format image2.jpg | sqlitepipe
sqlite3 -table stdin.data.sqlite3 'select * from data;'+-------------------------------------+
| blob |
+-------------------------------------+
| [FORMAT] |
| filename=image.png |
| nb_streams=1 |
| nb_programs=0 |
| nb_stream_groups=0 |
| format_name=png_pipe |
| format_long_name=piped png sequence |
| start_time=N/A |
| duration=N/A |
| size=2662 |
| bit_rate=N/A |
| probe_score=99 |
| [/FORMAT] |
+-------------------------------------+
| [FORMAT] |
| filename=image2.jpg |
| nb_streams=1 |
| nb_programs=0 |
| nb_stream_groups=0 |
| format_name=image2 |
| format_long_name=image2 sequence |
| start_time=0.000000 |
| duration=0.040000 |
| size=29800 |
| bit_rate=5960000 |
| probe_score=50 |
| [/FORMAT] |
+-------------------------------------+
For most basic usecases this already suffices. For example to find all png files I could use the following query:
select * from data where blob glob '*format_name=png_pipe*';Additional features
Arbitrary values
Storing single blobs without context might not be super useful so I also added a way to insert arbitrary values via command line arguments:
filename=image.png
ffprobe -show_format $filename | sqlitepipe -v "file=$filename" -v "date=$(date)"
sqlite3 -table stdin.data.sqlite3 'select * from data;'+-----------+------------------------------+-------------------------------------+
| file | date | blob |
+-----------+------------------------------+-------------------------------------+
| image.png | Fri Mar 6 13:19:51 CET 2026 | [FORMAT] |
| | | filename=image.png |
| | | nb_streams=1 |
| | | nb_programs=0 |
| | | nb_stream_groups=0 |
| | | format_name=png_pipe |
| | | format_long_name=piped png sequence |
| | | start_time=N/A |
| | | duration=N/A |
| | | size=2662 |
| | | bit_rate=N/A |
| | | probe_score=99 |
| | | [/FORMAT] |
+-----------+------------------------------+-------------------------------------+Line-based input
Sometimes I want to split the input by lines instead of just inserting everything as a blob, so I added a mode to do that:
ls -l image* | sqlitepipe -l
sqlite3 -table stdin.data.sqlite3 'select * from data;'+-----------------------------------------------------+
| line |
+-----------------------------------------------------+
| lrwxrwxrwx 1 alexx alexx 7 Mar 6 13:24 image1.png |
| lrwxrwxrwx 1 alexx alexx 10 Mar 6 13:25 image2.jpg |
| lrwxrwxrwx 1 alexx alexx 10 Mar 6 13:25 image.png |
+-----------------------------------------------------+More
You can check out the readme for a few more usage examples: README.md.
Performance
I didn't benchmark anything yet, but I suspect the performance to be not great, especially when calling sqlitepipe in a loop.
So far I prioritized usability over performance, but this might change later.
One could imagine a flow like this to improve performance when inserting large amounts of data:
sqlitepipe --start-transaction
for f in *; do
ffprobe -show_format $f | sqlitepipe
done
sqlitepipe --commit-transaction