Import CSV data with Flyway

In this tutorial, I will show you how to import CSV files the same way as Sql or Java migrations. This comes handy when you’re importing large data because doing it manually can become cumbersome.

Context

While I was working on Built In Africa, I found myself in need of importing CSV data as SQL migration so I thought of developing an extension to make it easier for me. It extends Flyway naming convention to be able to specify the target table name as part of the file name.

Flyway

Flyway is an open-source database migration tool licensed under Apache License 2.0. It implements automated and version-based database migrations and prevents database schema mismatch when you have multiple environments such as dev, test, and prod. One of the key advantages of Flyway is the automation of importing data and making it repeatable and predictable.

Guide

In order to use the extension, first you have to add jitpack.io as a new repository in your pom.xml file.

<repositories>
 <repository>
 <id>jitpack.io</id>
 <url>https://jitpack.io</url>
 </repository>
</repositories>

You then have to add the extension as a dependency to your project and change Tag to the version you want to use.

<dependency>
  <groupId>com.github.ramzimaalej</groupId>
  <artifactId>flyway-csv-extension</artifactId>
  <version>Tag</version>
</dependency>

Finally, add Flyway to your application’s configuration to import the CSV Migration Resolver.

flyway:
  datasources:
    default:
      enabled: true
      resolvers:
        - com.mytechden.flyway_csv.impl.resolver.CSVMigrationResolver

Naming convention

In order for the extension to work properly, follow the pattern defined below when naming your CSV files:

  • Prefix: V for versioned migrations, U for undo migrations, R for repeatable migrations
  • Version: Underscores (automatically replaced by dots at runtime) separate as many parts as you like (Not for repeatable migrations)
  • Separator: __ (two underscores)
  • Table: The name of the target table
  • Separator: __ (two underscores)
  • Description: Underscores (automatically replaced by spaces at runtime) separate the words

Example: V2__users__import_users

Types mapping

This extension uses Java built-in support to map field types to their corresponding column types. It supports UUID type and Null values.

UUID

In order to import CSV files that include UUIDs, you need to make sure the database column is of type BINARY(16) or equivalent.

Null values

All you have to do is to use null instead of an empty string in your csv files.

Demo micronaut app

I have built a Micronaut demo app and added the extension as a dependency. The whole working repository is available on GitHub.