Topological sorting of GFF features

A properly sorted GFF/GTF file can be critical:

  • If the file is not properly sorted, a genome browser can fail or display features incorrectly.

  • Some tools require files sorted in a particular way (e.g. tabix from htslib needs a GFF sorted by chromosome and position).

  • It makes the file easier for a human to read.

Zhigang Lu has made a nice post about his experience trying to find a way to get a correct topological sorting. See here.
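
To make "topological" concrete: a file is topologically sorted when every feature appears after the feature named in its Parent attribute. The following is a minimal sketch of such a check, not taken from any of the tools compared here; the function names `parse_attributes` and `misplaced_children` are hypothetical, and the sketch assumes plain tab-separated GFF3 lines with `ID=` and `Parent=` attributes.

```python
def parse_attributes(column9):
    """Turn 'ID=a;Parent=b,c' into {'ID': 'a', 'Parent': 'b,c'}."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def misplaced_children(gff_lines):
    """Return (line_number, parent_id) pairs where a child precedes its parent."""
    seen_ids = set()
    problems = []
    for number, line in enumerate(gff_lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, pragmas and comments
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 9:
            continue  # not a feature line
        attrs = parse_attributes(columns[8])
        # A feature may list several parents separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent and parent not in seen_ids:
                problems.append((number, parent))
        if "ID" in attrs:
            seen_ids.add(attrs["ID"])
    return problems


example = [
    "##gff-version 3",
    "SM_V7_1\tAUGUSTUS\texon\t103403\t103770\t.\t-\t.\tParent=Smp_315690.1",
    "SM_V7_1\tAUGUSTUS\tgene\t103403\t151162\t0.12\t-\t.\tID=Smp_315690",
    "SM_V7_1\tAUGUSTUS\tmRNA\t103403\t151162\t0.02\t-\t.\tID=Smp_315690.1;Parent=Smp_315690",
]
print(misplaced_children(example))  # [(2, 'Smp_315690.1')]
```

An empty list means that no child line precedes its parent, which is the property checked informally in the results below.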

Tests summary

| tool | option in command line | Type of sorting | Comment |
| --- | --- | --- | --- |
| AGAT | --tabix | by chromosomes, by gene position, by type (mRNAs then exon, then CDS then alphabetical feature types; then mRNA2 then exon2, then CDS2 then alphabetical feature2 types) | Fix GFF/GTF if needed |
| GenomeTools | -sortlines -tidy -retainids | by chromosomes and positions then random feature type | Lines with the same chromosome and start position are placed in arbitrary order, so parent feature lines might sometimes be placed after their children lines. |
| GenomeTools | -retainids | by chromosomes, by gene position, by type (mRNA then children; then mRNA2 then children2), by position (children are sorted by positions) | |
| GFF3sort | --precise | by chromosomes and positions, then by attribute: lines with "Parent=" attributes (case insensitive) are moved behind lines without "Parent=" attributes | The goal of GFF3sort is not to obtain a topological sorting but rather to produce something that third-party tools can index optimally. |
| gffread | | By default, chromosomes are kept in the order they were found. With the --sort-alpha parameter the chromosomes (reference sequences) are sorted alphabetically | /!\ Some feature types are lost, e.g. gene, three_prime_UTR, five_prime_UTR, etc. |
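
The two goals in the table (coordinate order for indexing, parents before children for topology) can be combined in a single sort key. The sketch below is only an illustration under stated assumptions and is not the algorithm of any tool listed above: the names `parse_feature` and `coordinate_then_parent_first` are hypothetical, children are assumed to lie within their parents' coordinates, and header lines are dropped for brevity.

```python
def parse_feature(line):
    """Extract the fields needed for sorting from one GFF3 feature line."""
    columns = line.rstrip("\n").split("\t")
    attrs = dict(
        field.split("=", 1) for field in columns[8].split(";") if "=" in field
    )
    return {
        "line": line,
        "seqid": columns[0],
        "start": int(columns[3]),
        "id": attrs.get("ID"),
        "parents": [p for p in attrs.get("Parent", "").split(",") if p],
    }


def coordinate_then_parent_first(gff_lines):
    """Sort by (seqid, start, depth in the Parent hierarchy)."""
    features = [
        parse_feature(line)
        for line in gff_lines
        if line.strip() and not line.startswith("#")
    ]
    by_id = {f["id"]: f for f in features if f["id"]}

    def depth(feature, visiting=frozenset()):
        # 0 for top-level features, 1 for their children, and so on.
        # 'visiting' guards against cyclic Parent references.
        parents = [
            by_id[p] for p in feature["parents"]
            if p in by_id and p not in visiting
        ]
        if not parents:
            return 0
        return 1 + max(depth(p, visiting | {p["id"]}) for p in parents)

    ordered = sorted(features, key=lambda f: (f["seqid"], f["start"], depth(f)))
    return [f["line"] for f in ordered]


unsorted_lines = [
    "SM_V7_1\tAUGUSTUS\texon\t103403\t103770\t.\t-\t.\tParent=Smp_315690.1",
    "SM_V7_1\tAUGUSTUS\tCDS\t103441\t103770\t0.93\t-\t0\tID=Smp_315690.1.cds;Parent=Smp_315690.1",
    "SM_V7_1\tAUGUSTUS\tgene\t103403\t151162\t0.12\t-\t.\tID=Smp_315690",
    "SM_V7_1\tAUGUSTUS\tmRNA\t103403\t151162\t0.02\t-\t.\tID=Smp_315690.1;Parent=Smp_315690",
]
for line in coordinate_then_parent_first(unsorted_lines):
    print(line.split("\t")[2])  # gene, mRNA, exon, CDS
```

Because a child cannot start before its parent under the stated assumption, ties on the start position are the only place where a child could overtake its parent, and the depth component of the key resolves exactly those ties.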

Example 1

This test is based on the file used by Zhigang Lu

The GFF file to sort

##gff-version 3
SM_V7_1	AUGUSTUS	exon	103403	103770	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	103403	103770	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	gene	103403	151162	0.12	-	.	ID=Smp_315690
SM_V7_1	AUGUSTUS	mRNA	103403	151162	0.02	-	.	ID=Smp_315690.1;Parent=Smp_315690
SM_V7_1	AUGUSTUS	mRNA	103403	151162	0.1	-	.	ID=Smp_315690.2;Parent=Smp_315690
SM_V7_1	AUGUSTUS	three_prime_UTR	103403	103440	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	three_prime_UTR	103403	103440	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	103441	103770	0.93	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	103441	103770	0.96	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	105920	106144	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	105920	106144	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	106876	107159	0.93	-	2	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	106876	107159	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	140582	140849	0.85	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	140582	140849	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	142981	143205	1	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	142981	143205	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	142981	143205	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	142981	143205	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	145395	145678	1	-	2	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	145395	145678	1	-	2	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	145395	145678	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	145395	145678	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	151075	151132	1	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	151075	151132	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	151075	151162	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	151075	151162	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	five_prime_UTR	151133	151162	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	five_prime_UTR	151133	151162	.	-	.	Parent=Smp_315690.2

Results

AGAT

AGAT v1.0.0

  • default sorting

agat_convert_sp_gxf2gxf.pl --gff test.gff

##gff-version 3
SM_V7_1	AUGUSTUS	gene	103403	151162	0.12	-	.	ID=Smp_315690
SM_V7_1	AUGUSTUS	mRNA	103403	151162	0.02	-	.	ID=Smp_315690.1;Parent=Smp_315690
SM_V7_1	AUGUSTUS	exon	103403	103770	.	-	.	ID=exon-1;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	142981	143205	.	-	.	ID=exon-6;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	145395	145678	.	-	.	ID=exon-8;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	151075	151162	.	-	.	ID=exon-10;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	103441	103770	0.93	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	142981	143205	1	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	145395	145678	1	-	2	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	151075	151132	1	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	five_prime_UTR	151133	151162	.	-	.	ID=five_prime_utr-1;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	three_prime_UTR	103403	103440	.	-	.	ID=three_prime_utr-1;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	mRNA	103403	151162	0.1	-	.	ID=Smp_315690.2;Parent=Smp_315690
SM_V7_1	AUGUSTUS	exon	103403	103770	.	-	.	ID=exon-2;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	105920	106144	.	-	.	ID=exon-3;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	106876	107159	.	-	.	ID=exon-4;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	140582	140849	.	-	.	ID=exon-5;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	142981	143205	.	-	.	ID=exon-7;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	145395	145678	.	-	.	ID=exon-9;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	151075	151162	.	-	.	ID=exon-11;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	103441	103770	0.96	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	105920	106144	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	106876	107159	0.93	-	2	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	140582	140849	0.85	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	142981	143205	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	145395	145678	1	-	2	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	151075	151132	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	five_prime_UTR	151133	151162	.	-	.	ID=five_prime_utr-2;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	three_prime_UTR	103403	103440	.	-	.	ID=three_prime_utr-2;Parent=Smp_315690.2

  • Tabix sorting

agat config --expose --tabix
agat_convert_sp_gxf2gxf.pl --gff test.gff

##gff-version 3
SM_V7_1	AUGUSTUS	gene	103403	151162	0.12	-	.	ID=Smp_315690
SM_V7_1	AUGUSTUS	mRNA	103403	151162	0.02	-	.	ID=Smp_315690.1;Parent=Smp_315690
SM_V7_1	AUGUSTUS	mRNA	103403	151162	0.1	-	.	ID=Smp_315690.2;Parent=Smp_315690
SM_V7_1	AUGUSTUS	exon	103403	103770	.	-	.	ID=exon-1;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	three_prime_UTR	103403	103440	.	-	.	ID=three_prime_utr-1;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	103403	103770	.	-	.	ID=exon-2;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	three_prime_UTR	103403	103440	.	-	.	ID=three_prime_utr-2;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	103441	103770	0.93	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	103441	103770	0.96	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	105920	106144	.	-	.	ID=exon-3;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	105920	106144	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	106876	107159	.	-	.	ID=exon-4;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	106876	107159	0.93	-	2	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	140582	140849	.	-	.	ID=exon-5;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	140582	140849	0.85	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	142981	143205	.	-	.	ID=exon-6;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	142981	143205	1	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	142981	143205	.	-	.	ID=exon-7;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	142981	143205	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	145395	145678	.	-	.	ID=exon-8;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	145395	145678	1	-	2	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	145395	145678	.	-	.	ID=exon-9;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	145395	145678	1	-	2	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	151075	151162	.	-	.	ID=exon-10;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	151075	151132	1	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	151075	151162	.	-	.	ID=exon-11;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	151075	151132	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	five_prime_UTR	151133	151162	.	-	.	ID=five_prime_utr-1;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	five_prime_UTR	151133	151162	.	-	.	ID=five_prime_utr-2;Parent=Smp_315690.2
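
Whether an output such as the one above satisfies the ordering tabix expects (lines grouped by sequence, start positions non-decreasing within each sequence) can be checked with a few lines of Python. This is a rough sketch; the function name `is_coordinate_sorted` is hypothetical, and the check only looks at columns 1 and 4 of tab-separated feature lines.

```python
def is_coordinate_sorted(gff_lines):
    """True if features are grouped by seqid and ordered by start within each group."""
    finished = set()  # sequences whose block of lines has already ended
    current = None
    last_start = 0
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # ignore blank lines, pragmas and comments
        columns = line.split("\t")
        seqid, start = columns[0], int(columns[3])
        if seqid != current:
            if seqid in finished:
                return False  # the sequence reappears after another one
            if current is not None:
                finished.add(current)
            current, last_start = seqid, start
        elif start < last_start:
            return False  # start position goes backwards
        else:
            last_start = start
    return True


# Order as in the default (transcript-grouped) output: exon 142981 precedes CDS 103441.
grouped = [
    "SM_V7_1\tAUGUSTUS\texon\t103403\t103770\t.\t-\t.\tID=exon-1;Parent=Smp_315690.1",
    "SM_V7_1\tAUGUSTUS\texon\t142981\t143205\t.\t-\t.\tID=exon-6;Parent=Smp_315690.1",
    "SM_V7_1\tAUGUSTUS\tCDS\t103441\t103770\t0.93\t-\t0\tID=Smp_315690.1.cds;Parent=Smp_315690.1",
]
# Same lines in position order.
by_position = [grouped[0], grouped[2], grouped[1]]

print(is_coordinate_sorted(grouped))      # False
print(is_coordinate_sorted(by_position))  # True
```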

GenomeTools

GenomeTools 1.6.1

gt gff3 -sortlines -tidy -retainids test.gff

##gff-version 3
##sequence-region   SM_V7_1 103403 151162
SM_V7_1	AUGUSTUS	three_prime_UTR	103403	103440	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	three_prime_UTR	103403	103440	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	103403	103770	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	103403	103770	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	gene	103403	151162	0.12	-	.	ID=Smp_315690
SM_V7_1	AUGUSTUS	mRNA	103403	151162	0.02	-	.	ID=Smp_315690.1;Parent=Smp_315690
SM_V7_1	AUGUSTUS	mRNA	103403	151162	0.1	-	.	ID=Smp_315690.2;Parent=Smp_315690
SM_V7_1	AUGUSTUS	CDS	103441	103770	0.96	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	103441	103770	0.93	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	105920	106144	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	105920	106144	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	106876	107159	0.93	-	2	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	106876	107159	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	140582	140849	0.85	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	140582	140849	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	142981	143205	1	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	142981	143205	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	142981	143205	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	142981	143205	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	145395	145678	1	-	2	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	145395	145678	1	-	2	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	145395	145678	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	145395	145678	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	151075	151132	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	151075	151132	1	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	151075	151162	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	151075	151162	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	five_prime_UTR	151133	151162	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	five_prime_UTR	151133	151162	.	-	.	Parent=Smp_315690.2

gt gff3 -retainids test.gff

##gff-version 3
##sequence-region   SM_V7_1 103403 151162
SM_V7_1	AUGUSTUS	gene	103403	151162	0.12	-	.	ID=Smp_315690
SM_V7_1	AUGUSTUS	mRNA	103403	151162	0.02	-	.	ID=Smp_315690.1;Parent=Smp_315690
SM_V7_1	AUGUSTUS	three_prime_UTR	103403	103440	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	103403	103770	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	103441	103770	0.93	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	142981	143205	1	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	142981	143205	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	145395	145678	1	-	2	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	145395	145678	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	151075	151132	1	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	151075	151162	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	five_prime_UTR	151133	151162	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	mRNA	103403	151162	0.1	-	.	ID=Smp_315690.2;Parent=Smp_315690
SM_V7_1	AUGUSTUS	three_prime_UTR	103403	103440	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	103403	103770	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	103441	103770	0.96	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	105920	106144	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	105920	106144	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	106876	107159	0.93	-	2	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	106876	107159	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	140582	140849	0.85	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	140582	140849	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	142981	143205	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	142981	143205	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	145395	145678	1	-	2	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	145395	145678	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	151075	151132	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	151075	151162	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	five_prime_UTR	151133	151162	.	-	.	Parent=Smp_315690.2
###

GFF3sort

GFF3sort 0.1.a1a2bc9

gff3sort.pl --precise test.gff

##gff-version 3
SM_V7_1	AUGUSTUS	gene	103403	151162	0.12	-	.	ID=Smp_315690
SM_V7_1	AUGUSTUS	mRNA	103403	151162	0.02	-	.	ID=Smp_315690.1;Parent=Smp_315690
SM_V7_1	AUGUSTUS	mRNA	103403	151162	0.1	-	.	ID=Smp_315690.2;Parent=Smp_315690
SM_V7_1	AUGUSTUS	three_prime_UTR	103403	103440	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	103403	103770	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	three_prime_UTR	103403	103440	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	103403	103770	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	103441	103770	0.93	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	103441	103770	0.96	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	105920	106144	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	105920	106144	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	106876	107159	0.93	-	2	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	106876	107159	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	140582	140849	0.85	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	140582	140849	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	142981	143205	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	142981	143205	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	142981	143205	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	142981	143205	1	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	145395	145678	1	-	2	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	145395	145678	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	145395	145678	1	-	2	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	145395	145678	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	151075	151132	1	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	151075	151162	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	151075	151162	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	151075	151132	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	five_prime_UTR	151133	151162	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	five_prime_UTR	151133	151162	.	-	.	Parent=Smp_315690.2

gffread

gffread v0.11.4

gffread test.gff

# gffread test.gff
# gffread v0.11.4
##gff-version 3
SM_V7_1	AUGUSTUS	mRNA	103403	151162	.	-	.	ID=Smp_315690.1;geneID=Smp_315690
SM_V7_1	AUGUSTUS	exon	103403	103770	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	142981	143205	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	145395	145678	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	151075	151162	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	103441	103770	.	-	0	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	142981	143205	.	-	0	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	145395	145678	.	-	2	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	151075	151132	.	-	0	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	mRNA	103403	151162	.	-	.	ID=Smp_315690.2;geneID=Smp_315690
SM_V7_1	AUGUSTUS	exon	103403	103770	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	105920	106144	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	106876	107159	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	140582	140849	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	142981	143205	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	145395	145678	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	151075	151162	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	103441	103770	.	-	0	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	105920	106144	.	-	0	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	106876	107159	.	-	2	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	140582	140849	.	-	0	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	142981	143205	.	-	0	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	145395	145678	.	-	2	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	151075	151132	.	-	0	Parent=Smp_315690.2
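
The gffread output above no longer contains gene or UTR lines. A quick way to spot this kind of loss is to compare the feature types (column 3) present before and after sorting. The sketch below assumes tab-separated GFF lines; `feature_type_counts` and `lost_feature_types` are hypothetical helper names.

```python
from collections import Counter


def feature_type_counts(gff_lines):
    """Count how many lines carry each feature type (column 3)."""
    counts = Counter()
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # ignore blank lines, pragmas and comments
        columns = line.split("\t")
        if len(columns) >= 3:
            counts[columns[2]] += 1
    return counts


def lost_feature_types(before, after):
    """Feature types present in 'before' but absent from 'after'."""
    return sorted(set(feature_type_counts(before)) - set(feature_type_counts(after)))


before = [
    "SM_V7_1\tAUGUSTUS\tgene\t103403\t151162\t0.12\t-\t.\tID=Smp_315690",
    "SM_V7_1\tAUGUSTUS\tmRNA\t103403\t151162\t0.02\t-\t.\tID=Smp_315690.1;Parent=Smp_315690",
    "SM_V7_1\tAUGUSTUS\tthree_prime_UTR\t103403\t103440\t.\t-\t.\tParent=Smp_315690.1",
    "SM_V7_1\tAUGUSTUS\texon\t103403\t103770\t.\t-\t.\tParent=Smp_315690.1",
]
after = [
    "SM_V7_1\tAUGUSTUS\tmRNA\t103403\t151162\t.\t-\t.\tID=Smp_315690.1;geneID=Smp_315690",
    "SM_V7_1\tAUGUSTUS\texon\t103403\t103770\t.\t-\t.\tParent=Smp_315690.1",
]
print(lost_feature_types(before, after))  # ['gene', 'three_prime_UTR']
```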

Example 2

This test is based on the file used by GFF3sort

The GFF file to sort

##gff-version 3
###
A01	Cufflinks	mRNA	473	6154	.	-	.	ID=XLOC_001154.41;description=Novel: Intergenic transcript
A01	Cufflinks	exon	473	814	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	1626	2574	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	2695	2721	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	3637	3726	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	5329	5408	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	5994	6154	.	-	.	Parent=XLOC_001154.41
###
A01	Cufflinks	mRNA	473	6386	.	-	.	ID=XLOC_001154.42;description=Novel: Intergenic transcript
A01	Cufflinks	exon	473	2024	.	-	.	Parent=XLOC_001154.42
A01	Cufflinks	exon	2615	2721	.	-	.	Parent=XLOC_001154.42
A01	Cufflinks	exon	3637	3726	.	-	.	Parent=XLOC_001154.42
A01	Cufflinks	exon	5329	6386	.	-	.	Parent=XLOC_001154.42

Results

AGAT

AGAT v0.9.0

  • default sorting

agat_convert_sp_gxf2gxf.pl --gff test2.gff --merge_loci

##gff-version 3
###
A01	Cufflinks	gene	473	6386	.	-	.	ID=nbisL1-mrna-1;description=Novel: Intergenic transcript
A01	Cufflinks	mRNA	473	6154	.	-	.	ID=XLOC_001154.41;Parent=nbisL1-mrna-1;description=Novel: Intergenic transcript
A01	Cufflinks	exon	473	814	.	-	.	ID=exon-1;Parent=XLOC_001154.41
A01	Cufflinks	exon	1626	2574	.	-	.	ID=exon-2;Parent=XLOC_001154.41
A01	Cufflinks	exon	2695	2721	.	-	.	ID=exon-3;Parent=XLOC_001154.41
A01	Cufflinks	exon	3637	3726	.	-	.	ID=exon-4;Parent=XLOC_001154.41
A01	Cufflinks	exon	5329	5408	.	-	.	ID=exon-5;Parent=XLOC_001154.41
A01	Cufflinks	exon	5994	6154	.	-	.	ID=exon-6;Parent=XLOC_001154.41
A01	Cufflinks	mRNA	473	6386	.	-	.	ID=XLOC_001154.42;Parent=nbisL1-mrna-1;description=Novel: Intergenic transcript
A01	Cufflinks	exon	473	2024	.	-	.	ID=exon-7;Parent=XLOC_001154.42
A01	Cufflinks	exon	2615	2721	.	-	.	ID=exon-8;Parent=XLOC_001154.42
A01	Cufflinks	exon	3637	3726	.	-	.	ID=exon-9;Parent=XLOC_001154.42
A01	Cufflinks	exon	5329	6386	.	-	.	ID=exon-10;Parent=XLOC_001154.42

  • Tabix sorting

agat_convert_sp_gxf2gxf.pl --gff test2.gff --merge_loci --tabix

##gff-version 3
###
A01	Cufflinks	gene	473	6386	.	-	.	ID=nbisL1-mrna-1;description=Novel: Intergenic transcript
A01	Cufflinks	mRNA	473	6154	.	-	.	ID=XLOC_001154.41;Parent=nbisL1-mrna-1;description=Novel: Intergenic transcript
A01	Cufflinks	mRNA	473	6386	.	-	.	ID=XLOC_001154.42;Parent=nbisL1-mrna-1;description=Novel: Intergenic transcript
A01	Cufflinks	exon	473	814	.	-	.	ID=exon-1;Parent=XLOC_001154.41
A01	Cufflinks	exon	473	2024	.	-	.	ID=exon-7;Parent=XLOC_001154.42
A01	Cufflinks	exon	1626	2574	.	-	.	ID=exon-2;Parent=XLOC_001154.41
A01	Cufflinks	exon	2615	2721	.	-	.	ID=exon-8;Parent=XLOC_001154.42
A01	Cufflinks	exon	2695	2721	.	-	.	ID=exon-3;Parent=XLOC_001154.41
A01	Cufflinks	exon	3637	3726	.	-	.	ID=exon-4;Parent=XLOC_001154.41
A01	Cufflinks	exon	3637	3726	.	-	.	ID=exon-9;Parent=XLOC_001154.42
A01	Cufflinks	exon	5329	5408	.	-	.	ID=exon-5;Parent=XLOC_001154.41
A01	Cufflinks	exon	5329	6386	.	-	.	ID=exon-10;Parent=XLOC_001154.42
A01	Cufflinks	exon	5994	6154	.	-	.	ID=exon-6;Parent=XLOC_001154.41

GenomeTools

GenomeTools 1.6.1

gt gff3 -sortlines -tidy -retainids test2.gff

##gff-version 3
##sequence-region   A01 473 6386
A01	Cufflinks	exon	473	814	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	473	2024	.	-	.	Parent=XLOC_001154.42
A01	Cufflinks	mRNA	473	6154	.	-	.	ID=XLOC_001154.41;description=Novel: Intergenic transcript
A01	Cufflinks	mRNA	473	6386	.	-	.	ID=XLOC_001154.42;description=Novel: Intergenic transcript
A01	Cufflinks	exon	1626	2574	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	2615	2721	.	-	.	Parent=XLOC_001154.42
A01	Cufflinks	exon	2695	2721	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	3637	3726	.	-	.	Parent=XLOC_001154.42
A01	Cufflinks	exon	3637	3726	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	5329	5408	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	5329	6386	.	-	.	Parent=XLOC_001154.42
A01	Cufflinks	exon	5994	6154	.	-	.	Parent=XLOC_001154.41
###

gt gff3 -retainids test2.gff

##gff-version 3
##sequence-region   SM_V7_1 103403 151162
SM_V7_1	AUGUSTUS	gene	103403	151162	0.12	-	.	ID=Smp_315690
SM_V7_1	AUGUSTUS	mRNA	103403	151162	0.02	-	.	ID=Smp_315690.1;Parent=Smp_315690
SM_V7_1	AUGUSTUS	three_prime_UTR	103403	103440	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	103403	103770	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	103441	103770	0.93	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	142981	143205	1	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	142981	143205	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	145395	145678	1	-	2	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	145395	145678	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	CDS	151075	151132	1	-	0	ID=Smp_315690.1.cds;Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	exon	151075	151162	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	five_prime_UTR	151133	151162	.	-	.	Parent=Smp_315690.1
SM_V7_1	AUGUSTUS	mRNA	103403	151162	0.1	-	.	ID=Smp_315690.2;Parent=Smp_315690
SM_V7_1	AUGUSTUS	three_prime_UTR	103403	103440	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	103403	103770	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	103441	103770	0.96	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	105920	106144	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	105920	106144	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	106876	107159	0.93	-	2	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	106876	107159	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	140582	140849	0.85	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	140582	140849	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	142981	143205	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	142981	143205	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	145395	145678	1	-	2	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	145395	145678	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	CDS	151075	151132	1	-	0	ID=Smp_315690.2.cds;Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	exon	151075	151162	.	-	.	Parent=Smp_315690.2
SM_V7_1	AUGUSTUS	five_prime_UTR	151133	151162	.	-	.	Parent=Smp_315690.2
###

GFF3sort

GFF3sort 0.1.a1a2bc9

gff3sort.pl --precise test2.gff

##gff-version 3
A01	Cufflinks	mRNA	473	6154	.	-	.	ID=XLOC_001154.41;description=Novel: Intergenic transcript
A01	Cufflinks	mRNA	473	6386	.	-	.	ID=XLOC_001154.42;description=Novel: Intergenic transcript
A01	Cufflinks	exon	473	2024	.	-	.	Parent=XLOC_001154.42
A01	Cufflinks	exon	473	814	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	1626	2574	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	2615	2721	.	-	.	Parent=XLOC_001154.42
A01	Cufflinks	exon	2695	2721	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	3637	3726	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	3637	3726	.	-	.	Parent=XLOC_001154.42
A01	Cufflinks	exon	5329	6386	.	-	.	Parent=XLOC_001154.42
A01	Cufflinks	exon	5329	5408	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	5994	6154	.	-	.	Parent=XLOC_001154.41

gffread

gffread v0.11.4

gffread test2.gff

# gffread test2.gff
# gffread v0.11.4
##gff-version 3
A01	Cufflinks	mRNA	473	6154	.	-	.	ID=XLOC_001154.41
A01	Cufflinks	exon	473	814	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	1626	2574	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	2695	2721	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	3637	3726	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	5329	5408	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	exon	5994	6154	.	-	.	Parent=XLOC_001154.41
A01	Cufflinks	mRNA	473	6386	.	-	.	ID=XLOC_001154.42
A01	Cufflinks	exon	473	2024	.	-	.	Parent=XLOC_001154.42
A01	Cufflinks	exon	2615	2721	.	-	.	Parent=XLOC_001154.42
A01	Cufflinks	exon	3637	3726	.	-	.	Parent=XLOC_001154.42
A01	Cufflinks	exon	5329	6386	.	-	.	Parent=XLOC_001154.42