sqlitepipe

Alexander Weld

Repository: https://gitlab.com/xicalango/sqlitepipe

Crate: https://crates.io/crates/sqlitepipe

sqlitepipe

For some reason I often find myself in a situation where I wanted to store the output of command invocations into a sqlite database for later analysis.

For example "for all files in this directory, store the output of ffprobe -show_format" or "for all files, store the output of magick -format %c histogram:info:".

So far I always ended up writing "the same" python script for this, but I always thought it'd be so much easier if I could just pipe the output of a command into a sqlite database, and sqlitepipe was born.

The most basic way to invoke sqlitepipe is by just piping data into it:

ffprobe -show_format image.png | sqlitepipe

This creates (or appends) to a database called stdin.data.sqlite:

sqlite3 -table stdin.data.sqlite3 'select * from data;'
+-------------------------------------+
|                blob                 |
+-------------------------------------+
| [FORMAT]                            |
| filename=image.png                  |
| nb_streams=1                        |
| nb_programs=0                       |
| nb_stream_groups=0                  |
| format_name=png_pipe                |
| format_long_name=piped png sequence |
| start_time=N/A                      |
| duration=N/A                        |
| size=2662                           |
| bit_rate=N/A                        |
| probe_score=99                      |
| [/FORMAT]                           |
+-------------------------------------+

Invocating again will simply append the data:

ffprobe -show_format image2.jpg | sqlitepipe
sqlite3 -table stdin.data.sqlite3 'select * from data;'
+-------------------------------------+
|                blob                 |
+-------------------------------------+
| [FORMAT]                            |
| filename=image.png                  |
| nb_streams=1                        |
| nb_programs=0                       |
| nb_stream_groups=0                  |
| format_name=png_pipe                |
| format_long_name=piped png sequence |
| start_time=N/A                      |
| duration=N/A                        |
| size=2662                           |
| bit_rate=N/A                        |
| probe_score=99                      |
| [/FORMAT]                           |
+-------------------------------------+
| [FORMAT]                            |
| filename=image2.jpg                 |
| nb_streams=1                        |
| nb_programs=0                       |
| nb_stream_groups=0                  |
| format_name=image2                  |
| format_long_name=image2 sequence    |
| start_time=0.000000                 |
| duration=0.040000                   |
| size=29800                          |
| bit_rate=5960000                    |
| probe_score=50                      |
| [/FORMAT]                           |
+-------------------------------------+

For most basic usecases this already suffices. For example to find all png files I could use the following query:

select * from data where blob glob '*format_name=png_pipe*';

Additional features

Arbitrary values

Storing single blobs without context might not be super useful so I also added a way to insert arbitrary values via command line arguments:

filename=image.png
ffprobe -show_format $filename | sqlitepipe -v "file=$filename" -v "date=$(date)"
sqlite3 -table stdin.data.sqlite3 'select * from data;'
+-----------+------------------------------+-------------------------------------+
|   file    |             date             |                blob                 |
+-----------+------------------------------+-------------------------------------+
| image.png | Fri Mar  6 13:19:51 CET 2026 | [FORMAT]                            |
|           |                              | filename=image.png                  |
|           |                              | nb_streams=1                        |
|           |                              | nb_programs=0                       |
|           |                              | nb_stream_groups=0                  |
|           |                              | format_name=png_pipe                |
|           |                              | format_long_name=piped png sequence |
|           |                              | start_time=N/A                      |
|           |                              | duration=N/A                        |
|           |                              | size=2662                           |
|           |                              | bit_rate=N/A                        |
|           |                              | probe_score=99                      |
|           |                              | [/FORMAT]                           |
+-----------+------------------------------+-------------------------------------+

Line-based input

Sometimes I want to split the input by lines instead of just inserting everything as a blob, so I added a mode to do that:

ls -l image* | sqlitepipe -l
sqlite3 -table stdin.data.sqlite3 'select * from data;'
+-----------------------------------------------------+
|                        line                         |
+-----------------------------------------------------+
| lrwxrwxrwx 1 alexx alexx  7 Mar  6 13:24 image1.png |
| lrwxrwxrwx 1 alexx alexx 10 Mar  6 13:25 image2.jpg |
| lrwxrwxrwx 1 alexx alexx 10 Mar  6 13:25 image.png  |
+-----------------------------------------------------+

More

You can check out the readme for a few more usage examples: README.md.

Performance

I didn't benchmark anything yet, but I suspect the performance to be not great, especially when calling sqlitepipe in a loop. So far I prioritized usability over performance, but this might change later.

One could imagine a flow like this to improve performance when inserting large amounts of data:

sqlitepipe --start-transaction
for f in *; do
  ffprobe -show_format $f | sqlitepipe
done
sqlitepipe --commit-transaction