Indian Journal of Cancer
Year : 2016  |  Volume : 53  |  Issue : 1  |  Page : 1-7

Next generation sequencing analysis of lung cancer datasets: A functional genomics perspective

1 Bioinformatics and Systems Biology, Bioclues Organization, IKP Knowledge Park, Secunderabad, Andhra Pradesh, India
2 Institute of Computer Science, University of Tartu, Tartu, Estonia, Europe

Date of Web Publication: 28-Apr-2016

Correspondence Address:
P Suravajhala
Bioinformatics and Systems Biology, Bioclues Organization, IKP Knowledge Park, Secunderabad, Andhra Pradesh

Source of Support: None, Conflict of Interest: None

DOI: 10.4103/0019-509X.180832



Cigarette smoking leads to serious epidemics in humans and affects the epithelial cells lining the respiratory tract. Several researchers have recently proposed that next generation sequencing (NGS), especially transcriptome sequencing, has enhanced the understanding of lung cancers and other epithelial diseases. The pathogenesis of lung cancer, with respect to the molecular fraction of genomic ribonucleic acid, shows mutational effects that differ among populations such as smokers with lung cancer and healthy never smokers. We review the impact of NGS data while providing insights into the biology of lung cancer affecting various populations, which we believe would complement predictive biology approaches. Furthermore, we consider what the outcome of such analysis would be for the Indian population. Bioinformatics analysis was performed using various tools. We identified five genes, namely epidermal growth factor receptor, Kirsten rat sarcoma, adenomatosis polyposis coli down regulated-1, N-ethylmaleimide-sensitive factor attachment protein, gamma and Piezo type mechanosensitive ion channel component 2, whose role was implicated in lung cancer; further analysis has to be performed to check whether or not these genes are indeed involved in causing lung cancer.

Keywords: Functional genomics, next generation sequencing, lung cancer

How to cite this article:
Pappu P, Madduru D, Chandrasekharan M, Modhukur V, Nallapeta S, Suravajhala P. Next generation sequencing analysis of lung cancer datasets: A functional genomics perspective. Indian J Cancer 2016;53:1-7


Pappu P, Madduru D.
Equal contributing authors.

  Introduction

Lung carcinoma is characterized by uncontrolled cell growth in the lung tissue. One of the major causes of death world-wide, it is associated with exposure to synthetics and chemicals (e.g. tobacco associated carcinogens and benzene family compounds), heavy metals (such as radium, cadmium and arsenic), radiation (radioactive and non-radioactive), naturally occurring microbial agents such as certain aflatoxins and viruses (Rous sarcoma, Hepatitis B and human papilloma), genotoxic agents, cigarette smoke and others (diesel exhaust and passive smoking).[1],[2] All these carcinogens, especially covalent carcinogen-deoxyribonucleic acid (DNA) adducts, may lead to base misincorporation and hence to mutations, which finally result in lung cancer (lung carcinomas are reviewed in [Table 1]).[3]
Table 1: Types of lung cancers and their mode of classification


Lung cancer is caused mostly by cigarette smoking; however, cases have been reported in which nonsmokers also develop the disease.[4] According to US statistics, men and women who smoke are about 23 and 13 times more likely, respectively, to develop lung cancer than non-smokers.[5] In India, squamous cell carcinoma of the lung was found to be the most common type among smokers, whereas adenocarcinoma was more prevalent in non-smoking women when histology, demographic factors and other common factors were taken into consideration.[6] Whether or not passive inhalation in nonsmokers is associated with lung cancer remains to be answered before we know what exactly contributes to the disease. Mutations in genes that stimulate cell growth (Kirsten rat sarcoma [KRAS], MYC) result in abnormalities in growth factor receptor signaling (epidermal growth factor receptor [EGFR], human epidermal growth factor receptor 2 [HER2]/neu); they also inhibit apoptosis (B-cell lymphoma 2) and contribute to the proliferation of abnormal cells.[2] In this process, mutations in proto-oncogenes and tumor suppressor genes also result in cancer, and several commonly known lung cancer genes are affected, viz. EGFR, KRAS, AKT1, BRAF, MET, ERBB2, STK11, MYC, MYCL, MYCN, NRAS, PIK3CA, PTEN, CDKN2A, RB1, TP53 and the EML4-ALK fusion. Recent studies have reported that some individuals with lung cancer who never smoked have mutations in the EGFR gene.[7] A few lung cancer genes that are significantly involved in regulation are reviewed here, and the types of lung cancers are summarized in [Table 1].

Adenomatosis polyposis coli down regulated-1 (APCDD1) is a 514 amino acid homodimer that interacts with LRP5 and WNT3A and is a negative regulator of the Wnt signaling pathway.[8] It is down regulated by the wild type adenomatosis polyposis coli protein. Via its interaction with WNT and LRP proteins, it may play a role in colorectal tumorigenesis, though evidence of it causing lung cancer is yet to be determined.[9] APCDD1 is also known to be a target gene of the Wnt/beta-catenin pathway, transcriptionally regulated by the CTNNB1/TCF7L2 complex, with two potential TCF4 binding sites.[10],[11] Hereditary hypotrichosis simplex is caused by mutations in this gene, whereas colorectal cancer and lung cancer might be caused by its over expression.[12]

N-ethylmaleimide-sensitive factor attachment protein, gamma (NAPG) encodes the soluble N-ethylmaleimide-sensitive fusion (NSF) attachment protein (SNAP) gamma, which enables NSF to bind to target membranes.[13],[14],[15] Different isoforms of this family are expressed in specific tissues; for example, the SNAP family includes a brain-specific isoform.[16] As NSF and SNAP are associated with the intracellular membrane fusion apparatus, their action is controlled by SNAP receptors specific to the membranes being fused.[17],[18] The NAPG product mediates platelet exocytosis, controls the membrane fusion events of this process and is required for vesicular transport between the endoplasmic reticulum and the Golgi apparatus.[19] The gene spans about 26,894 bases on chromosome 18p11.22, and the protein has 312 amino acids and a mass of about 34,746 Da. By similarity, it binds to RIP11 and VTI1A. In addition to interacting with NSF, it interacts with RAB11A, STX8, ELAVL1 and a host of proteins essential for vesicular transport. NAPG has been implicated in bipolar disorder, where polymorphisms in the human NAPG gene are possible risk factors for development of the disorder.[20] The genotype results indicate that three NAPG single nucleotide polymorphisms (SNPs) show significant association with the disorder and may play a role in neurotransmission in the central nervous system. However, its role in causing lung cancer is yet to be elucidated.

Piezo type mechanosensitive ion channel component 2 (PIEZO2) is a very large transmembrane protein whose gene spans 478,524 bases on chromosome 18p11.22. With 2752 amino acids and a mass of 318,064 Da, it contains a conserved domain of the domain of unknown function (DUF) family and 24-36 predicted transmembrane domains. The latter are conserved among several species. Synonyms include Fam38B (family with sequence similarity 38 B), FLJ23403 and C18orf58. In somatosensory neurons, PIEZO2 plays a role in rapidly adapting mechanically activated (MA) cationic currents.[21] Quantitative PCR of adult mouse tissues showed the strongest expression in lung, bladder and colon and weaker expression in stomach, small intestine, skin and kidney, while abundant expression in dorsal root ganglia sensory neurons pointed to a potential role in somatosensory mechanotransduction.[22] PIEZO1 and PIEZO2 share protein domains, and PIEZO2 has four isoforms.

KRAS is an oncogene that was discovered as the transforming gene carried by acutely transforming retroviruses. The Harvey and Kirsten rat sarcoma viruses carry homologous genes, v-H-ras and v-K-ras, and their highly conserved cellular counterparts encode 21 kDa proteins that are structurally similar to the signaling proteins known as G-proteins.[23] Somatic mutations in ras genes have been found in human lung cancer (non-small cell lung cancers [NSCLCs]). Human tumors exhibit mutations at codons 12 and 61, and the KRAS gene is alternatively spliced into two isoforms, viz. KRAS A and KRAS B.[24]

EGFR is necessary for cell proliferation. Located on chromosome 7p12 with 28 exons, it encodes a 170 kDa membrane bound protein consisting of three domains, namely an extracellular domain, a transmembrane domain and a cytosolic domain.[25] The receptors exist as inactive monomers; upon binding of a ligand such as epidermal growth factor (EGF) or transforming growth factor (TGF) alpha, the receptors undergo conformational changes that facilitate homo- or heterodimerization. Apart from being one of the most commonly mutated genes in lung cancers, it is a member of the receptor tyrosine kinase family of growth factor receptors, viz. HER1/erb-b1/EGFR, HER2/erb-b2/neu, HER3/erb-b3 and HER4/erb-b4. The EGFR activated pathways include Akt, STAT and mitogen-activated protein kinases, which induce cell proliferation. EGFR induces cancer via at least three mechanisms, viz. over expression of EGFR ligands, amplification of EGFR and mutational activation of EGFR.[26] Furthermore, other genes are involved, such as mesenchymal-epithelial transition (MET), a proto-oncogene that encodes a receptor tyrosine kinase; its heterodimerization results in trans-phosphorylation of receptor tyrosine kinases (RTKs) in a manner dependent on the kinase activity of MET.[27]

  Mutations Harbored in EGFR: With Phosphorylation, EGFR Activates Receptor Kinases

Of all the genes discussed so far, K-Ras and EGFR mutations are mutually exclusive in NSCLCs. Whereas EGFR mutations are characteristic of tumors arising in non-smokers, K-Ras mutations are associated with smoking related cancers. EGFR is the protein most commonly mutated in lung cancer. Approximately 90% of EGFR mutations in lung cancer are either a leucine to arginine substitution at position 858 (L858R) or a deletion in exon 19 that affects the conserved sequence LREA (delE746-A750).[28] These mutations activate the tyrosine kinase activity of EGFR by destabilizing its auto-inhibited conformation, which is normally maintained in the absence of ligand stimulation. Riely et al. reviewed the mutations in the exons encoding the EGFR kinase domain (18-21):[29]

  • Mutations resulting in drug sensitivity:
    • Point mutations in exon 18 (G719A/C)
    • Point mutations in exon 21 (L858R and L861Q)
    • In-frame deletions in exon 19 (eliminates LREA downstream to Lys745 residue)
  • Kinase domain mutations resulting in drug resistance:
    • Exon 19 point mutation D761Y
    • Exon 20 point mutation T790M
    • Exon 20 insertion (D770-N771 ins NPG)
  • Mutations outside the kinase domain are rare: EGFRvIII mutation
  • Germline mutations were also seen linked to exon 19 deletion and L858R mutations resulting in ligand-independent activation and prolonged receptor kinase activity after ligand stimulation.
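The point mutations listed above can be held in a small lookup table, which makes the exon and drug-response grouping explicit. The following is a minimal sketch only: the names `PointMutation`, `parse_point_mutation`, `EGFR_POINT_MUTATIONS` and `drug_response` are hypothetical, and the table covers just the single-residue substitutions named in the list, not the deletions or insertions.

```python
import re
from typing import NamedTuple


class PointMutation(NamedTuple):
    ref: str       # wild-type amino acid (one-letter code)
    position: int  # residue number in the protein
    alt: str       # substituted amino acid


def parse_point_mutation(label: str) -> PointMutation:
    """Split a substitution label such as 'L858R' into its three parts."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a simple point mutation: {label!r}")
    ref, position, alt = match.groups()
    return PointMutation(ref, int(position), alt)


# (exon, drug response) for the point mutations named in the list above.
EGFR_POINT_MUTATIONS = {
    "G719A": (18, "sensitive"),
    "G719C": (18, "sensitive"),
    "L858R": (21, "sensitive"),
    "L861Q": (21, "sensitive"),
    "D761Y": (19, "resistant"),
    "T790M": (20, "resistant"),
}


def drug_response(label: str) -> str:
    """Return 'sensitive', 'resistant' or 'unclassified' for a mutation label."""
    parse_point_mutation(label)  # raises ValueError on a malformed label
    entry = EGFR_POINT_MUTATIONS.get(label)
    return entry[1] if entry else "unclassified"
```

For instance, `drug_response("T790M")` looks up the exon 20 substitution and reports it as resistant, while a label that is absent from the table is reported as unclassified.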

Phosphorylation is the key event in activation of receptor tyrosine kinases. Phosphorylation of tyrosine residues in the tyrosine kinase domain of EGFR activates the receptor and recruits downstream signaling proteins that bind to the phosphotyrosine residues. Gotoh et al. examined whether or not autophosphorylation of EGFR correlates with the capacity of the activated EGFR to induce cell growth and transformation.[30] Truncation of the human EGFR after residue 1011 removed all three major autophosphorylation sites. While activation and receptor autophosphorylation are interrelated, Zhang et al. mapped phosphorylation sites through mass spectrometric approaches and showed that EGFR carrying oncogenic mutations exhibited tyrosine kinase sensitivity.[31] Further work by A.M. Honegger on induced intermolecular autophosphorylation of EGFR in living cells identified in vitro a kinase with a point mutation (K721A) resulting in loss of kinase activity.[32] The phosphorylation pattern was found to be similar to that of the wild type, indicating that all substrate sites are accessible to intermolecular phosphorylation. Several studies of tyrosine phosphorylation of EGFR and K-Ras proteins using unbiased phosphoproteomic approaches concluded that phosphorylation occurs in a specific context and recruits signaling molecules with SH2 domains.[33]

  Bioinformatic Analysis Reveals That the Five Proteins Involved in Lung Cancers Are Localized to Different Organelles

Although the human genome project has determined the sequence of chemical base pairs, sequencing techniques have since been enriched by a plethora of improved methods. The Sanger and Maxam-Gilbert methods use radioactively labeled sets of DNA fragments obtained from four reactions; the fragments are separated by size in adjacent lanes of a polyacrylamide gel and detected by autoradiography. More recently, capillary gel electrophoresis has been used to separate radioactively labeled DNA fragments, and a single instrument can generate 1 to 2 million bases per day. Sanger's method remains suitable for small scale projects and offers greater granularity than the newer techniques. Its main limitations for larger sequence output are the need for gels or polymers to separate fluorescently labeled DNA fragments, the relatively low number of samples that can be analyzed in parallel and the difficulty of fully automating sample preparation. These limitations led to the development of techniques such as next generation sequencing (NGS), which work without gels and can analyze large amounts of data in parallel [Figure 1].[34]
Figure 1: An overview of next generation sequencing techniques. Courtesy:


Rapid development of sequencing methods has yielded low-cost, high throughput techniques on several platforms. In the Illumina (Solexa) Genome Analyzer, the DNA fragments are ligated at both ends to adapters and, after denaturation, immobilized at one end on a solid support. The surface of the support is densely coated with the adapters and the complementary adapters. Each single-stranded fragment, immobilized at one end on the surface, creates a “bridge” structure by hybridizing with its free end to the complementary adapter on the surface of the support. In the mixture containing the PCR amplification reagents, the adapters on the surface act as primers for the subsequent PCR amplification. Amplification is needed to obtain sufficient light signal intensity for reliable detection of the added bases. The system generates at least 1.5 GB of single-read data per run and at least 3 GB of data in a paired-end run, recording data from more than 50 million reads per flow cell. The run time for a 36-cycle run was decreased to 2 days for a single-read run and 4 days for a paired-end run. In the Applied Biosystems ABI SOLiD system, DNA fragments are ligated to adapters and then bound to beads such that only one fragment is bound per bead, within a water droplet in an oil emulsion containing the amplification reagents. The DNA fragments on the beads are amplified by emulsion PCR. After DNA denaturation, the beads are deposited onto a glass support surface and a primer is hybridized to the adapter. Then a mixture of oligonucleotide octamers is hybridized to the DNA fragments and a ligation mixture is added. In these octamers, the doublet of the fourth and fifth bases is characterized by one of four fluorescent labels at the end of the octamer. Detection of the fluorescence from the label thus determines bases 4 and 5 in the sequence. The ligated octamer oligonucleotides are cleaved off after the fifth base, removing the fluorescent label, and the hybridization and ligation cycles are then repeated. By this technique, sequences can be determined in parallel for more than 50 million bead clusters.
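The idea that one of four labels reports on a pair of adjacent bases can be illustrated with the two-base ("colour space") encoding commonly described for SOLiD reads. This is a sketch under assumptions: the text above states only that a base doublet is identified by one of four fluorescent labels, so the specific XOR mapping and the function names below are illustrative and are not taken from the article.

```python
# Two-bit code per base; the colour of a dinucleotide is the XOR of the two codes,
# so 16 possible doublets map onto 4 colours (an assumed, commonly described scheme).
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = {code: base for base, code in BASE_CODE.items()}


def encode_colors(sequence: str) -> list[int]:
    """Encode each overlapping pair of bases as one of four colours (0-3)."""
    return [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(sequence, sequence[1:])]


def decode_colors(first_base: str, colors: list[int]) -> str:
    """Recover the bases from a known first base and the colour calls."""
    bases = [first_base]
    for color in colors:
        # XOR is its own inverse: previous base code XOR colour gives the next base.
        bases.append(CODE_BASE[BASE_CODE[bases[-1]] ^ color])
    return "".join(bases)
```

Because each colour is shared by four different doublets, a colour call alone does not identify the bases; decoding needs a known starting base, here passed as `first_base`.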

Sequence analysis using “Blastp”[35] revealed that EGFR has 100% sequence similarity to its variant form A, EGF isoform b precursor, chain A of the extracellular domain of EGFR in complex with adnectin and chain A of the extracellular region of EGFR in complex with the Fab fragment of Imc-11f8. The other hits of interest with close identity are cell growth inhibiting protein 40, A431 specific P115 EGFR, EGF isoform d precursor and chain A of the extracellular domain of EGFR in complex with the Fab fragment of cetuximab/erbitux/imc C-225. KRAS was found to have significant similarities with a host of proteins such as the UBE2L3/KRAS fusion protein, the KRAS viral oncogene homolog and GTPase Kras. The closer the similarity of a sequence, the higher the probability that it plays a role in lung cancer. The localization studies showed that the gene products are not specific to a single organelle, as shown in [Table 2].
Table 2: Sub-cellular localization of important genes involved in lung cancer


As conserved domains play a very important role in the function of proteins, we further examined them using the NCBI conserved domains search (CDS).[36] The results were returned as specific and non-specific hits, which list the domain hits from various sequences. To validate these results, a confirmatory analysis was done using ProDom.[37] The EGFR protein showed two receptor L domains, four furin like repeats, a transmembrane domain and a protein kinase domain located in various regions of the sequence, and the regions predicted by ProDom overlap with those predicted by CDS [Figure 2].
Figure 2: NCBI conserved domains search for all the five important genes reviewed


The domain analysis for KRAS gave two domains: (1) the H_N_K_ras like domain, located in the region 3-164, and (2) GTPase SAR1 and related small G proteins, located in the region 1-185. NAPG has no conserved domains, but it does have a tetratricopeptide (TPR)-like helical region (IPR011990), a structural motif that is conserved in a wide range of proteins. Proteins containing TPRs mediate protein-protein interactions (PPIs) and the assembly of multiprotein complexes, and are involved in a variety of biological processes such as cell cycle regulation, transcriptional control, mitochondrial and peroxisomal protein transport, neurogenesis and protein folding. PIEZO2 showed one conserved domain from the protein of unknown function (DUF3595) family, which is functionally uncharacterized; members of this family are typically between 578 and 2525 amino acids in length and are found in eukaryotes. A phosphopantetheine site was also found at amino acids 105-120. For APCDD1, no putative conserved domains were identified.

When mutational analysis was carried out using a mutation database (MutDB),[38] loss or gain of function due to mutation was observed for mutations in EGFR and KRAS that result in lung cancer. These mutations were further confirmed by a similar analysis using dbSNP of NCBI.[39] The EGFR mutations, viz. L858R, L861Q, T790M, G719A and G719S, resulted in loss or gain of disorder or in structural changes when analyzed with the MutPred tool.[40] Only a few mutations in KRAS are known to date, at positions 12, 13 and 61. The point mutations predicted by MutDB and MutPred were G12C, G12D, G12R, G12V and G12S. The most significant mutation in APCDD1 is a missense mutation at the 9th position (L9R), which causes hereditary hypotrichosis, and V149I corresponds to gain of helix, while all the remaining were consistent SNPs. Further identification and analysis of phosphorylation sites was done only for EGFR, a member of the receptor tyrosine kinase family, as there is evidence that hyperphosphorylation of the EGFR tyrosine kinase domain may result in lung cancer. Phosphorylation site prediction using the NetPhosK and NetPhos 2.0 servers[41],[42] revealed that KRAS, NAPG, PIEZO2 and APCDD1 showed no significant evidence of phosphorylation playing a role in lung cancer.

  PPI Studies And Validation With NGS

PPI studies reveal candidate proteins that are involved in specific pathways or that have similar functions. When “Osprey and Genemania”[43],[44] were used to list interactants specific to lung cancer, a host of proteins were found to have significant connections with each other [Figure 3]. Only interactions using two or three intermediates were selected for analysis in the human grid. There were seven interactions between EGFR and KRAS involving two intermediates, of which PDGFRB, JAK2, SRC, AR and CDC25A also interacted directly with EGFR. The PPIs among EGFR, KRAS, APCDD1, NAPG and PIEZO2 were then searched and visualized in three grids, viz. Homo sapiens, Mus musculus and Rattus norvegicus. In humans, all five proteins were shown to interact mostly through physical interactions. Although other types of interactions exist among the five, only a few reliable intermediate candidates connect them. APCDD1 is known to coexpress with RAB11FIP5, which physically interacts with NAPG. The proteins interacting with EGFR were ERBB2, EGF, TGFA, GRB2, SOCS1, RASA1, SHC3, VAV3, EREG, SHC1, MET and SRC. There is also a direct interaction between KRAS and EGFR, which further plays a role in causing cancer. The annotated data were then subjected to quality checks and manipulation, mapping, quality score box plots and indel analysis using the NGS tools on “Galaxy”[45] and “DNAnexus”[46] on the cloud. We obtained five lung cancer datasets from the SRA of the NCBI database [Table 3].
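Searching for intermediates between two proteins amounts to finding short paths in an interaction graph. The sketch below is illustrative only and does not reproduce how Osprey or GeneMANIA work internally: it uses a hand-typed subset of the interactions named in the paragraph above, treats co-expression and physical interaction alike as undirected edges (a simplifying assumption), and the names `INTERACTIONS`, `build_graph` and `shortest_path` are hypothetical.

```python
from collections import deque

# A subset of the interactions named in the text, entered by hand for illustration.
INTERACTIONS = [
    ("EGFR", "KRAS"),
    ("EGFR", "ERBB2"), ("EGFR", "EGF"), ("EGFR", "TGFA"),
    ("EGFR", "GRB2"), ("EGFR", "MET"), ("EGFR", "SRC"),
    ("APCDD1", "RAB11FIP5"),   # co-expression
    ("RAB11FIP5", "NAPG"),     # physical interaction
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of protein pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the shortest chain of proteins or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in sorted(graph[path[-1]]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


graph = build_graph(INTERACTIONS)
```

With this edge list, `shortest_path(graph, "APCDD1", "NAPG")` passes through the single intermediate RAB11FIP5, whereas EGFR and NAPG are not connected at all, which only reflects how few edges were typed in.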
Figure 3: Protein-Protein Interactions of epidermal growth factor receptor, Kirsten rat sarcoma, adenomatosis polyposis coli down-regulated 1, N-ethylmaleimide-sensitive factor attachment protein, gamma and piezo-type mechanosensitive ion channel component 2 in Homo sapiens

Table 3: NCBI-sequence read archives containing the essential lung cancer datasets
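The quality check applied to such read archives can be pictured with a short per-position summary of FASTQ quality scores, the quantity that quality score box plots display. This is a minimal sketch, not the Galaxy or DNAnexus implementation; it assumes four-line FASTQ records and Phred+33 quality encoding, and the function names and the two example reads are invented for illustration.

```python
import io
from statistics import mean


def read_fastq(handle):
    """Yield (read_id, sequence, quality_string) from four-line FASTQ records."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break
        sequence = handle.readline().rstrip()
        handle.readline()  # '+' separator line
        quality = handle.readline().rstrip()
        yield header[1:], sequence, quality


def phred_scores(quality, offset=33):
    """Convert a quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality]


def mean_quality_per_position(records):
    """Average Phred score at each read position across all records."""
    columns = {}
    for _, _, quality in records:
        for position, score in enumerate(phred_scores(quality)):
            columns.setdefault(position, []).append(score)
    return [mean(columns[position]) for position in sorted(columns)]


# Two invented reads; these are not data from the SRA datasets.
EXAMPLE = "@read1\nACGT\n+\nIIII\n@read2\nACGA\n+\nII5#\n"
per_position = mean_quality_per_position(read_fastq(io.StringIO(EXAMPLE)))
```

A falling mean towards the end of the read, as in the invented example, is the usual reason for trimming reads before mapping.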


The FASTQ files obtained from the annotation were downloaded, and only the smoker with lung cancer dataset (SRX060176) was considered, as a step to check whether or not any of the five proteins annotated by us fall in proximity with the genome [Figure 4]. We found that the short read lengths, ca. 25-100-125 bases, together with the size and assembly of the human genome, make the NGS analysis interesting. It is also useful to understand short-read sequence alignment algorithms, which compare short sequences against a reference genome. If several reads differ in the same way from the reference sequence, the alignments can identify variants. Although there are a large number of alignment algorithms, a simple blastall or legacy blast (multi machine mode) made it possible to efficiently catalogue the mismatches and the alignment length between the query and subject sequences. A problem encountered with the current analysis is that blast can return many hits with similar sequences in the assembly, and as blast cannot reconstruct intron-exon structures, we might preferentially use TopHat or a blast-like alignment tool. From the hits, we also observe that the maximum number of mismatches that could appear in the anchored regions of a spliced alignment is 2. Further, to circumvent the problem of identifying single copy genes as regions in the contig dataset, we reduced the number of candidate sequences using unique significant matches (best match E-value ≤ 1e-10 and bit score ≥ 100). We conclude that the accessions NM_003826.2 and NM_033360.2 could be interesting candidates for further work. Conversely, recent work by Filost et al. on the role of cigarette smoking in activation of EGFR suggested that cigarette smoke induces generation of H2O2.[47] Although exposure to H2O2 causes phosphorylation of EGFR, the phosphorylation sites (Y845 and Y1045) differ from those induced by EGF binding. This active EGFR is characterized by impaired trafficking and degradation due to lack of ubiquitination. Ceramide generated under oxidative stress displaces cholesterol in membrane rafts, which supports changes in EGFR conformation. Upon exposure to H2O2, active EGFR and active c-Src co-localize with elevated ceramide. c-Src is physically bound to EGFR under H2O2 induced oxidative stress but not under EGF stimulation. Inactivation of protein tyrosine phosphatases by H2O2 is responsible for EGFR phosphorylation. The generation of hydrogen peroxide is therefore an important mediator required for phosphorylation of EGFR.[48]
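The selection of unique significant BLAST matches by E-value and bit score can be sketched as a filter over tabular BLAST output. The code assumes the standard 12-column tabular format (`-m 8` in legacy blastall, `-outfmt 6` in BLAST+) with no comment lines; the function name, the read and subject identifiers and the example rows are invented for illustration and are not results from this study.

```python
import csv
import io

# Column order of the standard 12-column BLAST tabular output.
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_significant_hits(handle, max_evalue=1e-10, min_bitscore=100.0):
    """Keep, per query, the highest-scoring hit that passes both thresholds."""
    best = {}
    reader = csv.DictReader(handle, fieldnames=BLAST_COLUMNS, delimiter="\t")
    for row in reader:
        evalue = float(row["evalue"])
        bitscore = float(row["bitscore"])
        if evalue > max_evalue or bitscore < min_bitscore:
            continue  # not a significant match
        current = best.get(row["qseqid"])
        if current is None or bitscore > float(current["bitscore"]):
            best[row["qseqid"]] = row
    return best


# Invented rows for illustration only.
EXAMPLE_HITS = (
    "read_1\tsubject_A\t98.0\t100\t2\t0\t1\t100\t501\t600\t1e-45\t180\n"
    "read_1\tsubject_B\t90.0\t100\t10\t0\t1\t100\t11\t110\t1e-20\t120\n"
    "read_2\tsubject_A\t85.0\t60\t9\t0\t1\t60\t1\t60\t1e-5\t50\n"
)
hits = best_significant_hits(io.StringIO(EXAMPLE_HITS))
```

In the invented example, `read_1` keeps only its stronger hit and `read_2` is dropped because it fails both thresholds; the `mismatch` and `length` columns of the retained rows correspond to the mismatch counts and alignment lengths catalogued in the text.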
Figure 4: Flowchart of the next generation sequence analysis and annotation schema employed


  Conclusions

NGS analysis of lung cancer has so far given researchers much hope that findings on common diseases can inform routine diagnostics in the molecular pathology laboratory. One drawback of these methods, however, is that they are time consuming, and the systems required for analysis and scientific interpretation still need to be developed, augmented and validated. Nevertheless, the problems compounded by the higher error rates of NGS can be reduced through a thorough data handling process. One can take the number of gene associated risks that exist for a specific disease and correlate them with data from a specific population. Through this mini review, we attempted this approach by suggesting the evidence for and outcome of such analyses. While evidence that EGFR and KRAS contribute to lung cancer is just beginning to be established, whether or not APCDD1, PIEZO2 and NAPG cause lung cancer needs to be determined. What remains to be seen is whether genetic variations in these genes cause lung cancer, what their incidence is in smokers as well as non-smokers and in the two genders, what role the common interactants play and, finally, whether a suitable drug target can be identified from such evidence. Cigarette smoke exposure may result in ligand independent activation of EGFR. At this stage EGFR exists in a quasi-dimer state, which is stabilized by both extracellular and intracellular receptor-receptor interactions and whose formation does not require an extracellular ligand. We reviewed the presence of NAPG, PIEZO2 and APCDD1 along with EGFR and KRAS in the SRA datasets of smokers with lung cancer using NGS analysis. However, the sequences we shortlisted can only serve as candidates; whether or not they are involved in oncogene expression in human non-small-cell lung carcinomas has to be ascertained by more extensive analysis. Further analysis has to be performed to check whether these can indeed be considered as candidates.

  Acknowledgment

We thank Drs. Sabitha Ramanathan and Viji Lakshmi of Cancer Institute, Chennai for providing us the problem to start with.

  References

1. Keith RL, Miller YE. Lung cancer chemoprevention: Current status and future prospects. Nat Rev Clin Oncol 2013;10:334-43.
2. Tsao AS. Lung carcinoma. The Merck manuals, 2013. Available from: [Last cited on 2013 Jul 23].
3. Carbone DP, Minna JD. The molecular genetics of lung cancer. Adv Intern Med 1992;37:153-71.
4. Rose JE, Behm FM, Drgon T, Johnson C, Uhl GR. Personalized smoking cessation: Interactions between nicotine dose, dependence and quit-success genotype score. Mol Med 2010;16:247-53.
5. Jemal A, Siegel R, Ward E, Hao Y, Xu J, Murray T, et al. Cancer statistics, 2008. CA Cancer J Clin 2008;58:71-96.
6. Behera D, Balamugesh T. Lung cancer in India. Indian J Chest Dis Allied Sci 2004;46:269-81.
7. Blons H, Pallier K, Le Corre D, Danel C, Tremblay-Gravel M, Houdayer C, et al. Genome wide SNP comparative analysis between EGFR and KRAS mutated NSCLC and characterization of two models of oncogenic cooperation in non-small cell lung carcinoma. BMC Med Genomics 2008;1:25.
8. Shimomura Y, Agalliu D, Vonica A, Luria V, Wajid M, Baumer A, et al. APCDD1 is a novel Wnt inhibitor mutated in hereditary hypotrichosis simplex. Nature 2010;464:1043-7.
9. Kikuchi A. Tumor formation by genetic mutations in the components of the Wnt signaling pathway. Cancer Sci 2003;94:225-9.
10. Akiyama T. Wnt/beta-catenin signaling. Cytokine Growth Factor Rev 2000;11:273-82.
11. Takahashi M, Fujita M, Furukawa Y, Hamamoto R, Shimokawa T, Miwa N, et al. Isolation of a novel human gene, APCDD1, as a direct target of the beta-Catenin/T-cell factor 4 complex with probable involvement in colorectal carcinogenesis. Cancer Res 2002;62:5651-6.
12. Baumer A, Belli S, Trüeb RM, Schinzel A. An autosomal dominant form of hereditary hypotrichosis simplex maps to 18p11.32-p11.23 in an Italian family. Eur J Hum Genet 2000;8:443-8.
13. Mastick CC, Falick AL. Association of N-ethylmaleimide sensitive fusion (NSF) protein and soluble NSF attachment proteins-alpha and -gamma with glucose transporter-4-containing vesicles in primary rat adipocytes. Endocrinology 1997;138:2391-7.
14. Chen D, Xu W, He P, Medrano EE, Whiteheart SW. Gaf-1, a gamma -SNAP-binding protein associated with the mitochondria. J Biol Chem 2001;276:13127-35.
15. Kawase K, Shibata M, Kawashima H, Hatsuzawa K, Nagahama M, Tagaya M, et al. Gaf-1b is an alternative splice variant of Gaf-1/Rip11. Biochem Biophys Res Commun 2003;303:1042-6.
16. Whiteheart SW, Griff IC, Brunner M, Clary DO, Mayer T, Buhrow SA, et al. SNAP family of NSF attachment proteins includes a brain-specific isoform. Nature 1993;362:353-5.
17. Whiteheart SW, Brunner M, Wilson DW, Wiedmann M, Rothman JE. Soluble N-ethylmaleimide-sensitive fusion attachment proteins (SNAPs) bind to a multi-SNAP receptor complex in Golgi membranes. J Biol Chem 1992;267:12239-43.
18. Wilson DW, Whiteheart SW, Wiedmann M, Brunner M, Rothman JE. A multisubunit particle implicated in membrane fusion. J Cell Biol 1992;117:531-8.
19. Lemons PP, Chen D, Bernstein AM, Bennett MK, Whiteheart SW. Regulated secretion in platelets: Identification of elements of the platelet exocytosis machinery. Blood 1997;90:1490-500.
20. Li X, Zhang J, Wang Y, Ji J, Yang F, Wan C, et al. Association study on the NAPG gene and bipolar disorder in the Chinese Han population. Neurosci Lett 2009;457:159-62.
21. Coste B, Mathur J, Schmidt M, Earley TJ, Ranade S, Petrus MJ, et al. Piezo1 and Piezo2 are essential components of distinct mechanically activated cation channels. Science 2010;330:55-60.
22. Dubin AE, Schmidt M, Mathur J, Petrus MJ, Xiao B, Coste B, et al. Inflammatory signals enhance piezo2-mediated mechanosensitive currents. Cell Rep 2012;2:511-7.
23. Kirsten WH, Mayer LA. Morphologic responses to a murine erythroblastosis virus. J Natl Cancer Inst 1967;39:311-35.
24. Macaluso M, Russo G, Cinti C, Bazan V, Gebbia N, Russo A. Ras family genes: An interesting link between cell cycle and cancer. J Cell Physiol 2002;192:125-30.
25. Gazdar AF. Epidermal growth factor receptor inhibition in lung cancer: The evolving role of individualized therapy. Cancer Metastasis Rev 2010;29:37-48.
26. Pao W, Miller VA. Epidermal growth factor receptor mutations, small-molecule kinase inhibitors, and non-small-cell lung cancer: Current knowledge and future directions. J Clin Oncol 2005;23:2556-68.
27. Tanizaki J, Okamoto I, Sakai K, Nakagawa K. Differential roles of trans-phosphorylated EGFR, HER2, HER3, and RET as heterodimerisation partners of MET in lung cancer with MET amplification. Br J Cancer 2011;105:807-13.
28. Vijayalakshmi R, Krishnamurthy A. Targetable “driver” mutations in non small cell lung cancer. Indian J Surg Oncol 2011;2:178-88.
29. Riely GJ, Politi KA, Miller VA, Pao W. Update on epidermal growth factor receptor mutations in non-small cell lung cancer. Clin Cancer Res 2006;12:7232-41.
Gotoh N, Tojo A, Muroya K, Hashimoto Y, Hattori S, Nakamura S, et al. Epidermal growth factor-receptor mutant lacking the autophosphorylation sites induces phosphorylation of Shc protein and Shc-Grb2/ASH association and retains mitogenic activity. Proc Natl Acad Sci U S A 1994;91:167-71.  Back to cited text no. 30
Zhang G, Fang B, Liu RZ, Lin H, Kinose F, Bai Y, et al. Mass spectrometry mapping of epidermal growth factor receptor phosphorylation related to oncogenic mutations and tyrosine kinase inhibitor sensitivity. J Proteome Res 2011;10:305-19.  Back to cited text no. 31
Honegger AM, Schmidt A, Ullrich A, Schlessinger J. Evidence for epidermal growth factor (EGF)-induced intermolecular autophosphorylation of the EGF receptors in living cells. Mol Cell Biol 1990;10:4035-44.  Back to cited text no. 32
Guha U, Chaerkady R, Marimuthu A, Patterson AS, Kashyap MK, Harsha HC, et al. Comparisons of tyrosine phosphorylated proteins in cells expressing lung cancer-specific alleles of EGFR and KRAS. Proc Natl Acad Sci U S A 2008;105:14112-7.  Back to cited text no. 33
França LT, Carrilho E, Kist TB. A review of DNA sequencing techniques. Q Rev Biophys 2002;35:169-200.  Back to cited text no. 34
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol 1990;215:403-10.  Back to cited text no. 35
Marchler-Bauer A, Zheng C, Chitsaz F, Derbyshire MK, Geer LY, Geer RC, et al. CDD: Conserved domains and protein three-dimensional structure. Nucleic Acids Res 2013;41:D348-52.  Back to cited text no. 36
Corpet F, Servant F, Gouzy J, Kahn D. ProDom and ProDom-CG: Tools for protein domain analysis and whole genome comparisons. Nucleic Acids Res 2000;28:267-9.  Back to cited text no. 37
Mooney SD, Altman RB. MutDB: Annotating human variation with functionally relevant data. Bioinformatics 2003;19:1858-60.  Back to cited text no. 38
Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, et al. dbSNP: The NCBI database of genetic variation. Nucleic Acids Res 2001;29:308-11.  Back to cited text no. 39
Li B, Krishnan VG, Mort ME, Xin F, Kamati KK, Cooper DN, et al. Automated inference of molecular mechanisms of disease from amino acid substitutions. Bioinformatics 2009;25:2744-50.  Back to cited text no. 40
Blom N, Sicheritz-Pontén T, Gupta R, Gammeltoft S, Brunak S. Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Proteomics 2004;4:1633-49.  Back to cited text no. 41
Blom N, Gammeltoft S, Brunak S. Sequence and structure-based prediction of eukaryotic protein phosphorylation sites. J Mol Biol 1999;294:1351-62.  Back to cited text no. 42
Breitkreutz BJ, Stark C, Tyers M. Osprey: A network visualization system. Genome Biol 2003;4:R22.  Back to cited text no. 43
Warde-Farley D, Donaldson SL, Comes O, Zuberi K, Badrawi R, Chao P, et al. The GeneMANIA prediction server: Biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res 2010;38:W214-20.  Back to cited text no. 44
Goecks J, Nekrutenko A, Taylor J, Galaxy Team. Galaxy: A comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biol 2010;11:R86.  Back to cited text no. 45
DNAnexus. Mountain View (CA): DNAnexus, Inc.; 2013. Available from: [Last cited on 2013 Jul 23].  Back to cited text no. 46
Filosto S, Khan EM, Tognon E, Becker C, Ashfaq M, Ravid T, et al. EGF receptor exposed to oxidative stress acquires abnormal phosphorylation and aberrant activated conformation that impairs canonical dimerization. PLoS One 2011;6:e23240.  Back to cited text no. 47
Meves A, Stock SN, Beyerle A, Pittelkow MR, Peus D. H(2)O(2) mediates oxidative stress-induced epidermal growth factor receptor phosphorylation. Toxicol Lett 2001;122:205-14.  Back to cited text no. 48


  [Figure 1], [Figure 2], [Figure 3], [Figure 4]

  [Table 1], [Table 2], [Table 3]

This article has been cited by
1 Loss of stretch-activated channels, PIEZOs, accelerates non-small cell lung cancer progression and cell migration
Zhicheng Huang, Zhiqiang Sun, Xueying Zhang, Kai Niu, Ying Wang, Jun Zheng, Hang Li, Ying Liu
Bioscience Reports. 2019; 39(3)
2 A Tale of Two States: Normal and Transformed, With and Without Rigidity Sensing
Michael Sheetz
Annual Review of Cell and Developmental Biology. 2019; 35(1): 169
3 Fundamental structural and functional properties of Aquaporin ion channels found across the kingdoms of life
Mohamad Kourghi, Jinxin V. Pei, Michael L. De Ieso, Saeed Nourmohammadi, Pak Hin Chow, Andrea J. Yool
Clinical and Experimental Pharmacology and Physiology. 2018; 45(4): 401


  2007 - Indian Journal of Cancer | Published by Wolters Kluwer - Medknow