This format lends itself to rapid searching in a fashion analogous to FASTA formatted sequence databases. In contrast to the KS query score scheme, which requires generating random reference gene list data, we adopted a simple regression scoring scheme with a corresponding statistic. Searches can be performed on a standard desktop PC and take 10 minutes per query. Although the present database, consisting of expression data for over 100,000 samples from five platforms covering three species, is drawn entirely from Affymetrix expression array chips, the methodology is truly platform independent and it is a straightforward matter to include data based on other array technologies. Other species and platform technologies will be added to SPIED in the future.
For the present study Affymetrix was chosen because of the relatively large number of available samples. Further details are presented in the Methods section below.
Results
Drug treatment based profile SPIED queries
The CMAP contains expression change profiles as ranked array probe IDs for 6,100 individual treatments corresponding to 1,309 distinct drug-like compounds. Statistically filtered response profiles can be defined for 1,218 of the drugs, as these have at least three instances in the database. The profiles can be mapped onto a non-redundant gene list by uniquely associating one probe ID with a given gene and dropping the other probe IDs for that gene, which show less robust expression changes across the database. This is the same methodology underlying the SPIED database.
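The probe-to-gene collapse described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `collapse_probes`, the input dictionaries, and the use of a single per-probe "robustness" number (for example, a mean absolute expression change across the database) are all assumptions made for the example.

```python
def collapse_probes(probe_to_gene, robustness):
    """Return {gene: probe}, keeping one probe per gene.

    probe_to_gene: dict mapping probe ID -> gene symbol.
    robustness:    dict mapping probe ID -> numeric score; how this score
                   is computed is an assumption of this sketch.
    """
    best = {}
    for probe, gene in probe_to_gene.items():
        current = best.get(gene)
        # Keep this probe if the gene has no probe yet, or if it is more robust.
        if current is None or robustness[probe] > robustness[current]:
            best[gene] = probe
    return best


probe_to_gene = {"p1": "TP53", "p2": "TP53", "p3": "MYC"}
robustness = {"p1": 0.4, "p2": 1.2, "p3": 0.7}
gene_to_probe = collapse_probes(probe_to_gene, robustness)
```

In this toy input the two TP53 probes are reduced to the one with the higher score, so every gene ends up represented by exactly one probe.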
We took the responder profiles for the 1,218 drugs and searched SPIED for maximally correlated expression change profiles. The objective is to see to what extent the CMAP transcriptional signatures correlate with transcriptional responses assimilated within our platform independent database of over 100,000 microarrays deposited in the public domain by a very large number of groups. The CMAP is well populated with drugs that target the same or different steps in the PI3K/mTOR signalling cascade. In this context the results for LY-294002, rapamycin and wortmannin showed a high degree of overlap (see Additional file 1 for the full fold change data). It is a straightforward matter to query SPIED with these drug expression profiles. This is done by calculating the regression scores against the individual SPIED entries and retaining the top 100 correlations; see Methods for details.
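The query step (score every database entry, keep the top 100) can be illustrated with a short sketch. The exact regression score is defined in Methods and is not reproduced here; as a stand-in, this example assumes a Pearson correlation of fold changes over the genes shared by the query and the entry. The names `regression_score` and `top_correlates` and the dictionary layout are hypothetical.

```python
import heapq
import math


def regression_score(query, entry):
    """Pearson correlation over shared genes (an assumed stand-in score)."""
    shared = [g for g in query if g in entry]
    if len(shared) < 2:
        return 0.0
    x = [query[g] for g in shared]
    y = [entry[g] for g in shared]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def top_correlates(query, database, n=100):
    """Return the n highest-scoring (score, sample_id) pairs, best first."""
    scored = ((regression_score(query, profile), sample_id)
              for sample_id, profile in database.items())
    return heapq.nlargest(n, scored)


query = {"A": 1.0, "B": -1.0, "C": 2.0}
database = {
    "s1": {"A": 1.0, "B": -1.0, "C": 2.0},   # identical to the query
    "s2": {"A": -1.0, "B": 1.0, "C": -2.0},  # sign-reversed
    "s3": {"A": 1.0, "B": 1.0, "C": 1.0},    # flat profile
}
hits = top_correlates(query, database, n=2)
```

Using `heapq.nlargest` avoids sorting the whole database when only a small number of top hits is retained.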
For simplicity and uniformity of treatment, unless otherwise stated, we query SPIED with expression profiles containing the 500 genes with the largest fold changes passing the p < 0.05 significance threshold. It should be noted that results will be largely insensitive to the size of the query profile. The top SPIED correlate for all three drugs was a series of T47D breast cancer cells treated with the pan-PI3K inhibitor GDC-0941, and the regression scores for the three query signatures against all six samples in the series are shown in Figure 1A.
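The query profile construction described above (significance filter, then the 500 largest fold changes) might be sketched as follows. The function name `build_query_profile` and the input layout are hypothetical, and "largest fold values" is interpreted here as largest absolute fold change, which is an assumption of this sketch.

```python
def build_query_profile(fold_changes, p_values, size=500, alpha=0.05):
    """Select up to `size` significant genes with the largest |fold change|.

    fold_changes: dict gene -> signed (e.g. log) fold change.
    p_values:     dict gene -> p-value for that change.
    """
    # Keep genes that pass the significance threshold.
    significant = [g for g, p in p_values.items()
                   if p < alpha and g in fold_changes]
    # Rank by magnitude of change, largest first (assumed interpretation).
    significant.sort(key=lambda g: abs(fold_changes[g]), reverse=True)
    return {g: fold_changes[g] for g in significant[:size]}


fold = {"G1": 2.0, "G2": -3.0, "G3": 0.5, "G4": 5.0}
pvals = {"G1": 0.01, "G2": 0.001, "G3": 0.04, "G4": 0.2}
profile = build_query_profile(fold, pvals, size=2)
```

Here G4 has the largest change but fails the p < 0.05 filter, so the two-gene profile consists of G2 and G1.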