🤔What’s TCC?

TCC^[1] is an R/Bioconductor package that provides a series of functions for performing differential expression (DE) analysis of RNA-seq count data using a robust normalization strategy (called DEGES).

The basic idea of DEGES (DEG elimination strategy) is that potential differentially expressed genes (DEGs) among the compared samples should be removed before data normalization, so that the resulting gene list is well ranked: true DEGs are top-ranked and non-DEGs are bottom-ranked. TCC implements this as a multi-step normalization procedure.
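The idea can be illustrated with a toy example. The sketch below is not the TCC implementation (TCC relies on the packages cited in this section); it assumes a hypothetical two-sample dataset, a simple total-count scaling, and a fixed log2-ratio threshold, all chosen only for illustration. Counts are assumed to be positive.

```python
from math import log2

def depth_ratio(counts, genes):
    """Total count of sample B divided by total count of sample A, over `genes`."""
    total_a = sum(counts[g][0] for g in genes)
    total_b = sum(counts[g][1] for g in genes)
    return total_b / total_a

def eliminate_and_renormalize(counts, threshold=1.0):
    """Toy three-step procedure: normalize, flag potential DEGs, renormalize."""
    genes = list(counts)
    # Step 1: provisional scaling ratio estimated from every gene.
    first = depth_ratio(counts, genes)
    # Step 2: flag genes whose log2 ratio is still large after scaling.
    flagged = {
        g for g in genes
        if abs(log2(counts[g][1] / (counts[g][0] * first))) > threshold
    }
    # Step 3: re-estimate the ratio from the unflagged genes only.
    kept = [g for g in genes if g not in flagged]
    second = depth_ratio(counts, kept)
    return first, second, flagged

# Hypothetical counts (sample A, sample B). Sample B was sequenced twice as
# deeply, so non-DEGs show B = 2 * A; g7 and g8 are up-regulated in sample A.
counts = {
    "g1": (10, 20), "g2": (20, 40), "g3": (30, 60), "g4": (40, 80),
    "g5": (50, 100), "g6": (60, 120), "g7": (40, 20), "g8": (80, 40),
}
first, second, flagged = eliminate_and_renormalize(counts)
print(round(first, 3), second, sorted(flagged))
```

With all genes included, the estimated depth ratio is about 1.45 because the two DEGs inflate the total of sample A. Once they are set aside, the estimate returns to the true value of 2.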

TCC internally uses functions provided by edgeR^[2], DESeq^[3], DESeq2^[4], and baySeq^[5]. The multi-step normalization in TCC can be carried out with functions from these four packages.

🔬TCC-GUI: Graphical User Interface for TCC package

TCC-GUI is a graphical version of TCC. All parameter settings of the original package are available, and it adds many plotting functions that the original package does not currently provide.

🛠Function

Generation of simulation data.
Dataset summary and sample distribution plots for sample quality control.
Detection of differentially expressed genes (DEGs).
Interactive visualization with MA plots, volcano plots, expression level plots, and so on.
PCA and heatmap analysis (clustering included).
Output of results as tables, figures, code, or reports (.md, .pdf) (under development).

Please check the other tabs in Guidance for details.

📧Contact

If you have any questions, comments, or advice about the application, please contact 📧suwei(at)bi.a.u-tokyo.ac.jp or 📧kadota(at)bi.a.u-tokyo.ac.jp.

You can also visit 🔗GitHub and open a new issue for bug reports or feature requests (you can write in English, Chinese, or Japanese).

📚References

[1] Sun J, Nishiyama T, Shimizu K, et al. TCC: an R package for comparing tag count data with robust normalization strategies. BMC Bioinformatics, 2013, 14(1): 219.
[2] Robinson M D, McCarthy D J, Smyth G K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics, 2010, 26(1): 139-140.
[3] Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biology, 2010, 11(10): R106.
[4] Love M I, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology, 2014, 15(12): 550.
[5] Hardcastle T J, Kelly K A. baySeq: empirical Bayesian methods for identifying differential expression in sequence count data. BMC Bioinformatics, 2010, 11(1): 422.

🔗Emoji icons supplied by EmojiOne

Simulation Data

Simulation Data is the GUI version of the simulateReadCounts function, which generates simulation data from the parameters described below.

[Figure: the main interface for “Simulation Data”]

You can set the random seed, the number of genes (N_gene), the proportion of DEGs (P_DEG), and the number of groups (N_group). The tabs in Group Parameters change with the N_group value to stay consistent with it. The caption showing the number of DEGs at the bottom of this panel and the Summary panel update in real time as the parameters change, which helps speed up the analysis.

Besides N_gene, P_DEG, and N_group, the Summary panel also shows:

P_Gi: Proportion of DEGs assigned to group i (i = 1, 2, …, N_group).
FC_Gi: Degree of fold-change in group i (i = 1, 2, …, N_group).
NR_Gi: Number of replicates in group i (i = 1, 2, …, N_group).
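As a rough illustration of how these parameters combine, the sketch below derives the total number of DEGs and the number assigned to each group. The function name `deg_counts`, the rounding rule, and the example values are assumptions made for illustration; the counts the app displays may be computed differently.

```python
def deg_counts(n_gene, p_deg, p_g):
    """Return (total DEGs, DEGs per group) for the given simulation settings."""
    # The per-group proportions are assumed to sum to 1.
    if abs(sum(p_g) - 1.0) > 1e-9:
        raise ValueError("group proportions must sum to 1")
    n_deg = round(n_gene * p_deg)
    per_group = [round(n_deg * p) for p in p_g]
    return n_deg, per_group

# Hypothetical settings: N_gene = 10000, P_DEG = 0.2, P_G1 = 0.9, P_G2 = 0.1.
n_deg, per_group = deg_counts(10000, 0.2, [0.9, 0.1])
print(n_deg, per_group)  # 2000 [1800, 200]
```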

After you click the Generate Simulation Data button, the dataset is generated within several seconds from the current parameters and displayed in the Simulation Data panel. Cells are colored from white to dark blue as expression values go from low to high. You can download the dataset to your local machine or click “Exploratory Analysis (Step 1)” to start analyzing it directly.

Steps for data import

First, click the Data Import (Step 1) tab in the sidebar on the left of this page.
At the top left, you can click [Import Data] to load a test dataset, or click the [Upload] tab to upload your own tab-delimited text file, such as hypodata.txt. Please make sure it is a raw count data file. If you upload a large dataset to the online version of TCC-GUI (for example, a file with 50,000 rows), please wait until the upload has completed. For large datasets, the offline version is highly recommended.

The data is displayed once it has loaded.
After the dataset has loaded, enter your grouping in the [Group Selection] panel.
- The first column is the sample name (identical to the column name in your input dataset). Only the columns listed here are included in the analysis.
- The second column is the group name (such as “control” or “sample”).
Click the [Confirmed] button and wait a moment; [Summary of Data] and [Sample Distribution] will then show more information about your dataset. You can modify the plots and save them for study or publication.
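The relationship between the count table and the group assignment can be sketched as follows. This is an illustration only: the comma-separated "sample,group" layout, the function names, and the example data are assumptions, not the app's actual input handling.

```python
import csv
import io

def parse_groups(text):
    """Parse 'sample,group' lines into an ordered {sample: group} mapping."""
    groups = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) >= 2 and row[0].strip():
            groups[row[0].strip()] = row[1].strip()
    return groups

def select_columns(table_text, groups):
    """Keep only the count columns whose sample names appear in `groups`."""
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    header = next(reader)
    missing = [s for s in groups if s not in header[1:]]
    if missing:
        raise ValueError(f"samples not found in the count table: {missing}")
    # Search from index 1 so the gene-name column is never selected.
    idx = [header.index(s, 1) for s in groups]
    return {row[0]: [int(row[i]) for i in idx] for row in reader if row}

# Hypothetical tab-delimited count table and group assignment.
table_text = (
    "gene\tG1_rep1\tG1_rep2\tG2_rep1\tG2_rep2\n"
    "gene_1\t34\t45\t122\t16\n"
    "gene_2\t358\t388\t22\t36\n"
)
group_text = "G1_rep1,control\nG2_rep1,sample\n"

groups = parse_groups(group_text)
selected = select_columns(table_text, groups)
print(groups)
print(selected)
```

Samples that are not listed in the group assignment (here G1_rep2 and G2_rep2) are simply left out of the selected table.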

Steps for calculation

Click the [Calculation] tab in the sidebar on the left of this page.
You can change any of the TCC calculation parameters or leave them at their defaults. Click the [Run TCC Calculation] button and wait for the calculation to finish. Depending on the size of your dataset, the method you chose, and the number of iterations, the calculation takes from several seconds to 2 minutes (WAD < voom < others).
After the calculation, [Result Table] appears on the right of the page, and the [Sample Distribution] plots before and after normalization are drawn at the same time.

You can also copy and save the R code of the TCC calculation (under the [TCC Parameters] panel) to study the code or to reproduce the same results on a local machine.
Next, the Step3 and Step4 tabs appear in the sidebar. Step3 is for data exploration, visualization, and analysis, while Step4 is for outputs. You can choose either of them for the next step of your analysis.

🤔What’s MA plot?

An MA plot is an application of a Bland–Altman plot for visual representation of genomic data. The plot visualizes the differences between measurements taken in two samples, by transforming the data onto M (log ratio) and A (mean average) scales, then plotting these values. Though originally applied in the context of two channel DNA microarray gene expression data, MA plots are also used to visualize high-throughput sequencing analysis (quote from 🔗wikipedia).
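For a single gene, the M and A values can be computed as in the sketch below. The function name `ma_values` and the example means are assumptions for illustration; the app plots values derived from the normalized data, and both inputs must be positive here.

```python
from math import log2

def ma_values(mean_g1, mean_g2):
    """Return (M, A): the log2 ratio and the average log2 expression."""
    m = log2(mean_g2) - log2(mean_g1)
    a = (log2(mean_g1) + log2(mean_g2)) / 2
    return m, a

# Hypothetical mean expression of one gene in group 1 and group 2.
m, a = ma_values(8, 32)
print(m, a)  # 2.0 4.0
```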

Steps for plotting

Change the parameters or leave them at their defaults, then click the [Generate MA-Plot] button. The MA plot appears in the middle of the page.
To check the information for a specific point (transcript or gene), hover your cursor over it; additional information is displayed, and an expression level plot is shown on the right side of the page.
To mark genes on the plot, click the corresponding rows in the Result Table panel, then click the [Generate MA-Plot] button again to refresh the plot.
On the left of the page, the FDR vs DEGs panel lists several FDR cut-offs with the number of DEGs at each.
On the right of the page, the R code for the MA plot is also provided.
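The FDR vs DEGs table can be thought of as a count of genes below each cut-off. The sketch below assumes that a gene is counted as a DEG when its q-value is strictly below the cut-off; the function name and the q-values are hypothetical.

```python
def degs_by_cutoff(q_values, cutoffs):
    """Count the genes whose q-value is below each FDR cut-off."""
    return {c: sum(1 for q in q_values if q < c) for c in cutoffs}

# Hypothetical q-values for six genes.
q_values = [0.001, 0.02, 0.04, 0.08, 0.3, 0.9]
table = degs_by_cutoff(q_values, [0.01, 0.05, 0.1])
print(table)  # {0.01: 1, 0.05: 3, 0.1: 4}
```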

🤔What’s Volcano Plot?

In statistics, a volcano plot is a type of scatter-plot that is used to quickly identify changes in large data sets composed of replicate data. It plots significance versus fold-change on the y and x axes, respectively. These plots are increasingly common in omic experiments such as genomics, proteomics, and metabolomics where one often has a list of many thousands of replicate data points between two conditions and one wishes to quickly identify the most meaningful changes (quote from 🔗wikipedia).
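The coordinates of one point on such a plot can be sketched as follows, assuming the common convention of log2 fold-change on the x axis and -log10 of the p-value on the y axis. The function name and the input values are hypothetical.

```python
from math import log2, log10

def volcano_point(mean_g1, mean_g2, p_value):
    """Return (x, y) = (log2 fold-change, -log10 p-value) for one gene."""
    x = log2(mean_g2 / mean_g1)
    y = -log10(p_value)
    return x, y

# Hypothetical gene: 4-fold higher in group 2, with p = 0.001.
x, y = volcano_point(10, 40, 0.001)
print(x, y)
```

Genes with both a large absolute fold-change and a small p-value end up in the upper left or upper right corner of the plot.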

Steps for plotting

The steps are almost the same as for [MA Plot]. Please refer to the MA Plot documentation.

You can select genes by two methods, By list and By FDR.

By list: Enter the names of the genes you want shown in the heatmap.
By FDR: Use an FDR cut-off and the top N genes to generate the heatmap.

Change the heatmap parameters (if you wish).
Click the [Run heatmap] button, and the result appears on the right.
How do you change the colors in the heatmap? In Choose colormap, select a color map and the number of colors you want to use.
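The By FDR selection can be sketched as a filter followed by a ranking. This is an illustration under assumptions: genes are kept when their q-value is strictly below the cut-off and are ranked by q-value; the function name and the values are hypothetical.

```python
def select_by_fdr(q_values, fdr_cutoff, top_n):
    """Return up to `top_n` gene names with q-value below `fdr_cutoff`."""
    passing = [(q, gene) for gene, q in q_values.items() if q < fdr_cutoff]
    # Sort by q-value (ties broken by gene name) and keep the best top_n.
    return [gene for q, gene in sorted(passing)[:top_n]]

# Hypothetical q-values.
q_values = {"gene_1": 0.20, "gene_2": 0.001, "gene_3": 0.03, "gene_4": 0.04}
chosen = select_by_fdr(q_values, fdr_cutoff=0.05, top_n=2)
print(chosen)  # ['gene_2', 'gene_3']
```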

Select gene(s) you want to plot in the left panel. Barplot and Boxplot are provided.

Logs function

Coming soon.

Report function

Coming soon.

More help

Install the original TCC package from Bioconductor

Pipeline of TCC-GUI

1. Data input
2. Computation

3.1 MA plot
3.2 Volcano plot
3.3 PCA analysis
3.4 Heatmap
3.5 Expression

4. More help

TCC-GUI

TCC: Differential expression analysis for tag count data with robust normalization strategies

This package provides a series of functions for performing differential expression analysis of RNA-seq count data using a robust normalization strategy (called DEGES). The basic idea of DEGES is that potential differentially expressed genes or transcripts (DEGs) among the compared samples should be removed before data normalization to obtain a well-ranked gene list where true DEGs are top-ranked and non-DEGs are bottom-ranked. This can be done by performing a multi-step normalization strategy (DEGES stands for DEG elimination strategy). A major characteristic of TCC is that it provides robust normalization methods for several kinds of count data (two-group with or without replicates, multi-group/multi-factor, and so on) by combining functions from the packages it depends on.

Authors: Jianqiang Sun, Tomoaki Nishiyama, Kentaro Shimizu, and Koji Kadota

GUI Version Developer: Wei Su

1. Data input

Click the “Computation” tab at the top;
Click the “Load Sample Data” button for a test, or click the “Upload…” button if you want to use your own count dataset.

2. Computation

Select “Group Count” and click the “Confirmed” button;
Select the column names of your dataset for grouping and click “Confirmed”; the “TCC Parameters” panel will then appear;
Change the parameters for computation (if you wish), and click “Run TCC”.
After computation, “Result Table” and a series of tabs for other analyses will appear.

3.1 MA plot

After computation, switch to the “MA Plot” tab, and click the “Generate MA-Plot” button;
Hover the cursor over a point, and additional information will be provided (gene expression plot).
If you want to mark some genes on the plot, click the corresponding rows, and click the “Generate MA-Plot” button again to refresh the plot.
(This is a prerelease version; more functions will be added.)

3.2 Volcano plot

Same as the MA plot section.
(This is a prerelease version; more functions will be added.)

3.3 PCA analysis

Change the parameters for PCA (if you wish), and click the “Run” button.
(This is a prerelease version; more functions will be added.)

3.4 Heatmap

Change the parameters for heatmap (if you wish), and click the “Run” button.
(This is a prerelease version; more functions will be added.)

3.5 Expression

Select gene(s) you want to plot in the left panel. Barplot and Boxplot are provided.
(This is a prerelease version; more functions will be added.)

4. More help

Install the original TCC package from Bioconductor

Data Simulation Parameters

Set Random Seed

Number of Genes (N_gene)

Proportion of DEGs (P_DEG)

Number of Groups (N_group)

Only single-factor experimental designs are supported at present.

Summary

Group Parameters

1. The 10000-gene dataset hypoData is simulation data generated by the TCC::simulateReadCounts function.

2. After you run a simulation in Step0, Simulation Data can be selected; it refers to the latest simulation result.

Upload Count Data

Upload...

Text file in .tsv/.csv format; the first column should contain the gene names.

Group Assignment

Input your group info

TCC-GUI expects the first label to be Group1 (G1), the next to be Group2 (G2), and so on.
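One way to read this rule is that groups are numbered in the order in which their labels first appear. The sketch below illustrates that reading only; it is an assumption about the behavior, and the function name and labels are hypothetical.

```python
def group_numbers(labels):
    """Number groups by order of first appearance: first label is 1, next is 2."""
    order = {}
    for label in labels:
        order.setdefault(label, len(order) + 1)
    return [order[label] for label in labels]

# Hypothetical group labels for four samples.
numbers = group_numbers(["control", "control", "sample", "sample"])
print(numbers)  # [1, 1, 2, 2]
```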

Summary

Read Count Table

TCC Computation Parameters

Normalization Method

DEG Identification Method

Filtering Threshold for Low Count Genes

Number of Iteration

FDR Cut-off

Elimination of Potential DEGs

TCC Computation Code

Result Table

Summary of TCC Normalization

MA Plot Parameters

Table
Plot
FDR vs DEGs

MA Plot Code

MA Plot

Result Table

Volcano Plot Parameters

Volcano Plot Code

Volcano Plot

Result Table

Heatmap Parameters

Heatmap R Code

Heatmap

Drawing the heatmap is very time-consuming when the number of genes exceeds a hundred. Reduce the number with a cut-off, or wait patiently.

Listed Gene Information Table

Expression Level Parameters

Expression Level R Code

Barplot
Boxplot

Expression Level

Expression Table
Result Table
Table of Expression Level

Report Parameters

Document Format

HTML report (default) is highly recommended.

TCC Algorithm Published 2013-07-09

Source Code Updated on 2019-02-08
