Apache Avro
| Developer(s) | Apache Software Foundation | 
|---|---|
| Initial release | 2 November 2009 | 
| Stable release | 1.11.3 / September 23, 2023 | 
| Repository | Avro Repository | 
| Written in | Java, C, C++, C#, Perl, Python, PHP, Ruby | 
| Type | Remote procedure call framework | 
| License | Apache License 2.0 | 
| Website | avro | 
Avro is a row-oriented remote procedure call and data serialization framework developed within Apache's Hadoop project. It uses JSON to define data types and protocols, and serializes data in a compact binary format. Its primary use is in Apache Hadoop, where it provides both a serialization format for persistent data and a wire format for communication between Hadoop nodes and from client programs to the Hadoop services. Avro uses a schema to structure the data being encoded. Schemas can be written in two languages: Avro IDL, which is intended for human editing, and a JSON-based format that is more machine-readable.
Avro is similar to Thrift and Protocol Buffers, but does not require running a code-generation program when a schema changes (unless desired for statically typed languages).
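The combination of a JSON schema and a compact binary encoding can be illustrated with a minimal Python sketch. It is not the API of any Avro library: the `User` schema and the helper names (`encode_long`, `encode_string`, `encode_record`) are hypothetical, and only the `string`, `int` and `long` primitive types are handled. The sketch assumes the encoding rules of the Avro specification, in which integers are zigzag-encoded as variable-length values, strings are length-prefixed UTF-8, and a record is the concatenation of its fields in schema order. Because the schema is parsed at run time, no code generation step is involved.

```python
import json

# Hypothetical schema for illustration; the record and field names
# are assumptions, not taken from any real dataset.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age", "type": "long"}
  ]
}
"""


def encode_long(n: int) -> bytes:
    """Zigzag-map a signed 64-bit integer, then emit it as a base-128 varint."""
    z = (n << 1) ^ (n >> 63)  # small magnitudes, positive or negative, stay small
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Encode a string as its UTF-8 byte length (a long) followed by the bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


# Maps a primitive type name in the schema to its encoder.
ENCODERS = {"int": encode_long, "long": encode_long, "string": encode_string}


def encode_record(schema: dict, record: dict) -> bytes:
    """Concatenate the encoded fields in the order the schema declares them."""
    return b"".join(
        ENCODERS[field["type"]](record[field["name"]])
        for field in schema["fields"]
    )


schema = json.loads(SCHEMA_JSON)  # the schema is read at run time
payload = encode_record(schema, {"name": "Ada", "age": 27})
print(payload.hex())  # 0641646136
```

The encoded record occupies five bytes and contains no field names or type tags, which is why a reader needs the schema to interpret the data.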
Apache Spark SQL can access Avro as a data source.