Stop Azure Data Factory Data Flows outputting to multiple files

Summary

I recently ran into an issue where I was trying to overwrite a single file in my Data Flow. However, Data Factory started to output several intermediary files instead.

I had two datasets:

  1. src_json
  2. tgt_json

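I wanted to overwrite the target file (account2.json) with the transformed source file. Instead, the output arrived as multiple files (SUCCESS, Part-00000-tid, _committed etc.). This happens because Data Flows run on Spark, which by default writes one part file per partition alongside its own marker files.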

To write to a single file, do this:

In your Sink, go to Settings, select ‘Output to a single file’, and then enter the file name (account2.json in my case).

This should result in a single file being overwritten.
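
If you prefer to check the setting in the Data Flow script view rather than the UI, the sink definition should look roughly like the sketch below. This is an approximation of what I would expect Data Factory to generate, not copied from my pipeline: the stream names TransformedSource and sink1 are hypothetical, and the exact properties may differ in your version, so compare it against your own script.

```
TransformedSource sink(allowSchemaDrift: true,
    validateSchema: false,
    partitionFileNames:['account2.json'],
    partitionBy('hash', 1)) ~> sink1
```

The two lines that matter are partitionFileNames, which names the single output file, and partitionBy('hash', 1), which collapses the data into one partition so only one file is written. Forcing a single partition can slow down large outputs, so it is best suited to small files like this one.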
