Oligo Datasets

The oligo datasets currently host oligo sequences for nine species: six plants (Arabidopsis, rice, maize, potato, barley and soybean), two mammals (human and mouse) and the model species zebrafish. Details of the oligos are shown in the table below.
To download the oligos, click the download link for the species. Details about Chorus2, which was used to design the oligos, can be found here.

Information table

Species      Scientific name       Reference       No. of oligos  Download1  Download2
Arabidopsis  Arabidopsis thaliana  TAIR10          1,089,367      bed        bed
Rice         Oryza sativa          TIGR7           1,710,279      bed        bed
Maize        Zea mays              B73_v4          1,641,770      bed        bed
Potato       Solanum tuberosum     DM_v4.04        1,674,902      bed        bed
Barley       Hordeum vulgare       IBSC_v2         2,968,560      bed        bed
Soybean      Glycine max           Gmax_ZH13_v2.0  1,851,420      bed        bed
Human        Homo sapiens          hg38            2,311,370      bed        bed
Mouse        Mus musculus          mm10            3,397,774      bed        bed
Zebrafish    Danio rerio           danRer11        2,442,308      bed        bed

Images

Arabidopsis Rice Maize
Potato Barley Soybean
Human Mouse Zebrafish


How to use the datasets

To use the oligo sequences of a target species, first download the bed file from the download columns of the table above.
Oligo sequences are provided in bed.bgz format, a compressed version of the bed file. Decompress the file as follows:
For Windows users:
Download and install 7-Zip.
Use 7-Zip to decompress the bed.bgz file.
For Linux/macOS users:
Use the following command to decompress the bed.bgz file:
$ gzip -cd xxx.bed.bgz > xxx.bed
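On any platform with Python installed, the same decompression can also be sketched with the standard library. This is a minimal sketch that assumes the bed.bgz file is gzip-compatible, as the gzip command above implies; the function name decompress_bgz and the file names are placeholders, not part of the dataset.

```python
import gzip
import shutil


def decompress_bgz(src_path, dest_path):
    """Decompress a gzip-compatible .bed.bgz file into a plain-text bed file."""
    # gzip.open reads every member of a multi-member gzip stream in turn,
    # so a file made of several concatenated compressed blocks is handled.
    with gzip.open(src_path, "rb") as src, open(dest_path, "wb") as dest:
        shutil.copyfileobj(src, dest)


# Example call (placeholder file names, as in the gzip command):
# decompress_bgz("xxx.bed.bgz", "xxx.bed")
```

Streaming with shutil.copyfileobj avoids loading the whole file into memory, which matters for files with millions of oligos.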

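The decompressed bed file is plain text and can be opened in a text editor or Excel.
For Windows users, a text editor (such as EditPlus) is the best choice for opening it. Note that an Excel worksheet holds at most 1,048,576 rows, fewer than the number of oligos in any dataset listed above, so Excel is practical only for subsets of a file.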

The bed file contains six delimiter-separated columns, for example:
Chr1 1360 1404 AAGATAGAGAACAAGAGAGTGAGAGGATAAGGATATAGACCAGAC 2841 +
The columns represent, in order, the chromosome, oligo start site, oligo end site, oligo probe sequence, k-mer score and target strand of the probe.
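The six-column layout can be read programmatically as sketched below. The names Oligo and parse_oligo_line are hypothetical, and treating the k-mer score as an integer is an assumption based only on the example row. In that row the sequence is 45 nt long, which equals end - start + 1; the sketch prints both values for this row and does not assume the relationship holds elsewhere.

```python
from typing import NamedTuple


class Oligo(NamedTuple):
    chrom: str
    start: int
    end: int
    sequence: str
    kmer_score: int  # assumed to be an integer, as in the example row
    strand: str


def parse_oligo_line(line):
    """Split one line of the oligo bed file into its six fields."""
    # split() with no argument accepts tabs or spaces as the delimiter.
    chrom, start, end, sequence, score, strand = line.split()
    return Oligo(chrom, int(start), int(end), sequence, int(score), strand)


example = parse_oligo_line(
    "Chr1 1360 1404 AAGATAGAGAACAAGAGAGTGAGAGGATAAGGATATAGACCAGAC 2841 +"
)
print(example.chrom, example.start, example.end, example.strand)
print(len(example.sequence), example.end - example.start + 1)
```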
Windows users can use the filter function of Excel to select target oligos.
For Linux/macOS users, an awk or perl command may be a better way to select the desired oligo sequences, for example:
awk '$1=="Chr1"&&$2>=100000&&$3<=200000' TIGR7.bed
This command extracts the oligos located within the region Chr1:100000-200000 in rice.
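For users without awk, the same region selection could be written in Python. This is a sketch that mirrors the awk condition (chromosome match, start at or after the region start, end at or before the region end); the function name select_region is hypothetical.

```python
def select_region(lines, chrom, region_start, region_end):
    """Yield bed lines whose oligo lies entirely inside the given region."""
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        if (fields[0] == chrom
                and int(fields[1]) >= region_start
                and int(fields[2]) <= region_end):
            yield line.rstrip("\n")


# Usage with the decompressed rice file named in the awk example:
# with open("TIGR7.bed") as handle:
#     for hit in select_region(handle, "Chr1", 100000, 200000):
#         print(hit)
```

Because the function is a generator, it reads the file one line at a time and never holds the full dataset in memory.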

Finally, oligo sequences in the fourth column of the bed file can be synthesized directly for oligo-FISH experiments.
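To hand the selected sequences to a synthesis provider, it can help to pull the fourth column into a simple list. The sketch below writes FASTA-style records; the function name oligos_to_fasta and the record naming scheme (chromosome:start-end(strand)) are illustrative assumptions, not a format required by the dataset or by any vendor.

```python
def oligos_to_fasta(bed_lines):
    """Return FASTA text with one record per oligo, taken from column four."""
    records = []
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        chrom, start, end, sequence, _score, strand = fields[:6]
        # The record name is an illustrative convention, not part of the dataset.
        records.append(f">{chrom}:{start}-{end}({strand})\n{sequence}\n")
    return "".join(records)


print(oligos_to_fasta(
    ["Chr1 1360 1404 AAGATAGAGAACAAGAGAGTGAGAGGATAAGGATATAGACCAGAC 2841 +"]
))
```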



Chorus2

Chorus2 is software developed to design genome-scale oligonucleotide-based probes for fluorescence in situ hybridization (FISH).
Chorus2 uses the Python script Chorus2.py to identify and pre-filter oligos. It implements a “k-mer score” method to remove repetitive oligos from the target genome. Chorus2 runs fast and can handle large genomes such as wheat. The oligos designed by Chorus2 have high specificity and are suitable for FISH. The Chorus2 package runs on Linux, macOS and Windows, with a flexible command-line interface or an easy-to-use GUI (graphical user interface).
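To give a feel for how a k-mer-based score can flag repetitive oligos, here is a toy illustration. It assumes the score is the sum of the genome-wide counts of every k-mer in the oligo; this is one common way to quantify repetitiveness and may differ from what Chorus2 actually computes. The function names, the tiny genome string and the value of k are all made up for the example.

```python
from collections import Counter


def count_kmers(genome, k):
    """Count every overlapping k-mer in a genome string."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def toy_kmer_score(oligo, kmer_counts, k):
    """Sum the genome-wide counts of each k-mer in the oligo (toy definition)."""
    # A Counter returns 0 for k-mers that never occur in the genome.
    return sum(kmer_counts[oligo[i:i + k]] for i in range(len(oligo) - k + 1))


# Made-up genome: a short tandem repeat followed by a unique stretch.
genome = "ACGTACGTACGT" + "GATTACA"
counts = count_kmers(genome, 4)

repeat_score = toy_kmer_score("ACGTACG", counts, 4)
unique_score = toy_kmer_score("GATTACA", counts, 4)
print(repeat_score, unique_score)
```

Under this toy definition, the oligo drawn from the repeated stretch scores higher than the one drawn from the unique stretch, which is the kind of signal a repeat filter can act on.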

Citation

Zhang T†,*, Liu GQ†, Zhao HN, Braz GT, Jiang JM*. Chorus2: design of genome-scale oligonucleotide-based probes for fluorescence in situ hybridization. Plant Biotechnology Journal 2021, 19(10):1967-1978.

* If you have any questions about using Chorus2 or our oligo datasets, please contact us (hanyangshuo@zhangtaolab.org or liuguanqing@zhangtaolab.org).