Search: [data] - souvenir

04/08/2025

Dolt is a SQL database that you can fork, clone, branch, merge, push and pull just like a Git repository.

Connect to Dolt just like any MySQL database to read or modify schema and data. Version control functionality is exposed in SQL via system tables, functions, and procedures.

Or, use the Git-like command line interface to import CSV files, commit your changes, push them to a remote, or merge your teammate's changes. All the commands you know for Git work exactly the same for Dolt.

Grist https://www.getgrist.com/

25/07/2025

Grist is a modern relational spreadsheet. It combines the flexibility of a spreadsheet with the robustness of a database.

dbt https://github.com/dbt-labs/dbt-core

04/07/2025

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

Kestra https://github.com/kestra-io/kestra

29/01/2025

Kestra is an open-source, event-driven orchestration platform that makes both scheduled and event-driven workflows easy. By bringing Infrastructure as Code best practices to data, process, and microservice orchestration, you can build reliable workflows directly from the UI in just a few lines of YAML.

Streamlit https://streamlit.io/

05/03/2023

Streamlit is an open-source app framework for Python. It helps you create web apps for data science and machine learning in a short time.

Flyte https://flyte.org/

27/03/2022

Flyte is an open-source, container-native, structured programming and distributed processing platform implemented in Golang. It enables highly concurrent, scalable and maintainable workflows for machine learning and data processing.

FillDB http://filldb.info/

31/12/2021

FillDB is a free tool that lets you quickly generate large volumes of custom data in MySQL format for testing software and populating databases with random data.

pydantic https://pydantic-docs.helpmanual.io/

07/11/2021

Data validation and settings management using Python type annotations.

jina https://github.com/jina-ai/jina

01/10/2021

Jina is a neural search framework that lets you build deep learning search applications in minutes. It provides scalable indexing, querying, and understanding of any kind of data: video, image, long/short text, music, source code, PDF, etc.

Superset https://superset.apache.org/

13/04/2021

Apache Superset is a modern data exploration and visualization platform.

Splitgraph https://www.splitgraph.com/

25/07/2020

Splitgraph is a data management, building and sharing tool inspired by Docker and Git that works on top of PostgreSQL and integrates seamlessly with anything that uses PostgreSQL.

Splitgraph allows the user to manipulate data images (snapshots of SQL tables at a given point in time) as if they were code repositories by versioning, pushing and pulling them.

Dolt https://github.com/liquidata-inc/dolt

07/04/2020

Dolt is a relational database, i.e. it has tables, and you can execute SQL queries against those tables. It also has version control primitives that operate at the level of the table cell. Thus Dolt is a database that supports fine-grained, value-wise version control, where all changes to data and schema are stored in a commit log.
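To make "version control at the level of the table cell" concrete, here is a minimal sketch in plain Python. It is not Dolt's implementation or API: the `cell_diff` function, the dict-based table layout, and the sample rows are all assumptions made up for illustration.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]
Table = Dict[int, Row]  # hypothetical layout: primary key -> row


def cell_diff(old: Table, new: Table) -> List[Tuple[int, str, object, object]]:
    """Return (primary_key, column, old_value, new_value) for every changed cell.

    A missing row or column is treated as None, so this sketch cannot tell
    "absent" apart from an explicit None value.
    """
    changes = []
    for pk in sorted(old.keys() | new.keys()):
        old_row = old.get(pk, {})
        new_row = new.get(pk, {})
        for col in sorted(old_row.keys() | new_row.keys()):
            before, after = old_row.get(col), new_row.get(col)
            if before != after:
                changes.append((pk, col, before, after))
    return changes


# Two hypothetical snapshots of the same table, one per commit.
commit_1 = {1: {"name": "Ada", "city": "London"}, 2: {"name": "Linus", "city": "Helsinki"}}
commit_2 = {1: {"name": "Ada", "city": "Paris"}, 2: {"name": "Linus", "city": "Helsinki"}}

changes = cell_diff(commit_1, commit_2)
```

Only the one cell that differs between the two snapshots is reported, rather than the whole row or table.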

Presidio https://github.com/microsoft/presidio

08/09/2019

Context-aware, pluggable and customizable data protection and PII data anonymization service for text and images.

librehosters https://libreho.st/

20/07/2019

librehosters is a network of cooperation and solidarity that uses free software to encourage decentralisation through federation and distributed platforms. Our values connect transparency, fairness and privacy with a culture of data portability and public contributions to the commons.

Kaitai Struct http://kaitai.io/

16/06/2019

A new way to develop parsers for binary structures.

Declarative: describe the very structure of the data, not how you read or write it
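The declarative idea can be sketched with Python's standard `struct` module. This is not Kaitai Struct's own description format or generated code; the header layout, the `HEADER_FIELDS` list, and the `parse` helper are hypothetical and only show a structure being described as data instead of as reading steps.

```python
import struct

# Hypothetical binary header, described as data:
# 2-byte magic, 1-byte version, then 16-bit unsigned width and height.
HEADER_FIELDS = [
    ("magic", "2s"),
    ("version", "B"),
    ("width", "H"),
    ("height", "H"),
]


def parse(fields, data, byte_order="<"):
    """Build a struct format string from the field list and unpack the data."""
    fmt = byte_order + "".join(code for _, code in fields)
    size = struct.calcsize(fmt)
    values = struct.unpack(fmt, data[:size])
    return dict(zip((name for name, _ in fields), values))


# Seven bytes of sample input, little-endian.
sample = b"KS" + bytes([1]) + (640).to_bytes(2, "little") + (480).to_bytes(2, "little")
header = parse(HEADER_FIELDS, sample)
```

Changing the format means editing `HEADER_FIELDS`; the parsing code stays the same.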

source{d} Engine https://sourced.tech/engine/

18/11/2018

Engineering managers and maintainers of large code bases are starting to realize the potential of Code as Data, or how source code can be treated as an analyzable dataset providing valuable information. Think business intelligence and process optimization based on the source code engineers write, rather than adjacent metrics.

pg_chameleon http://www.pgchameleon.org/

02/07/2018

pg_chameleon is a MySQL to PostgreSQL replica system written in Python 3. The tool can connect to the MySQL replication protocol and replicate the data changes in PostgreSQL. Whether the user needs to set up a permanent replica between MySQL and PostgreSQL or perform an engine migration, pg_chameleon is the perfect tool for the job.

pgLoader https://pgloader.io/

02/07/2018

pgloader loads data into PostgreSQL and allows you to implement Continuous Migration from your current database to PostgreSQL.

pgLoader has two modes of operation. It can either load data from files, such as CSV or Fixed-File Format; or migrate a whole database to PostgreSQL.

pgLoader supports several RDBMS solutions as a migration source, and fetches information from the catalog tables over a connection to then create an equivalent schema in PostgreSQL. This means that you can migrate to PostgreSQL with a single command line!

Falcor http://netflix.github.io/falcor/

28/05/2018

Falcor is the innovative data platform that powers the Netflix UIs. Falcor allows you to model all your backend data as a single Virtual JSON object on your Node server. On the client you work with your remote JSON object using familiar JavaScript operations like get, set, and call. If you know your data, you know your API.

Falcor is middleware. It is not a replacement for your application server, database, or MVC framework. Instead Falcor can be used to optimize communication between the layers of a new or existing application.
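The "single virtual JSON object" idea, where data is addressed by a path, can be sketched as follows. Falcor itself is a JavaScript library and this is not its API; the `get_path` and `set_path` helpers and the sample model are hypothetical, written in Python purely to illustrate path-based access.

```python
from typing import Any, Sequence


def get_path(model: dict, path: Sequence[Any]) -> Any:
    """Walk the nested model along the given path and return the value found."""
    node = model
    for key in path:
        node = node[key]
    return node


def set_path(model: dict, path: Sequence[Any], value: Any) -> None:
    """Set a value at the given path, creating intermediate dicts as needed."""
    node = model
    for key in path[:-1]:
        node = node.setdefault(key, {})
    node[path[-1]] = value


# Hypothetical model: all backend data exposed as one nested object.
model = {"user": {"name": "Ada"}, "titles": {0: {"name": "House of Cards"}}}

set_path(model, ["user", "rating"], 5)
title = get_path(model, ["titles", 0, "name"])
```

The caller only needs to know the shape of the data, which is the point of "if you know your data, you know your API".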

Public APIs https://github.com/toddmotto/public-apis

22/04/2018

A collective list of free APIs for use in web development.
