We performed a series of bioinformatics analyses on a gene expression dataset of 76 samples from early-stage non-small cell lung cancer, comprising 40 adenocarcinoma samples, 16 squamous cell carcinoma samples and 20 normal samples. To identify specific diagnostic markers, we compared each of the two subtypes with the normal samples to determine its gene expression characteristics. Using multidimensional scaling, we found that the samples clustered well by disease category. Based on this classification, and using empirical Bayes moderation and the TREAT method, we identified 486 important genes associated with the disease. We analyzed the functions and pathways of these genes to verify our results and to explain the pathogenic factors and processes. We then generated a protein-protein interaction network from the interactions among the selected genes and found that the top thirteen hub genes, including five newly identified by our method, were strongly associated with lung cancer or other cancers. These results indicate that contrasting gene expression between each subtype and normal samples provides important information for the detection of non-small cell lung cancer and aids exploration of the disease's pathogenesis.
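As an illustration of the hub-gene step, the following minimal Python sketch ranks genes in an interaction network by degree, that is, by their number of distinct interaction partners. It rests on assumptions: the abstract does not state which centrality measure or software was used to select hub genes, so degree is only one plausible choice, and the function name `rank_hub_genes`, the default `top_n=13`, and the placeholder gene labels in the toy edge list are hypothetical and not taken from the study.

```python
from collections import defaultdict


def rank_hub_genes(interactions, top_n=13):
    """Rank genes by degree (number of distinct interaction partners).

    `interactions` is an iterable of (gene_a, gene_b) pairs describing
    undirected protein-protein interactions. Degree centrality is an
    assumption here; the study may have used a different hub criterion.
    """
    neighbors = defaultdict(set)
    for gene_a, gene_b in interactions:
        if gene_a == gene_b:
            continue  # ignore self-interactions
        # Sets deduplicate repeated or reversed edges.
        neighbors[gene_a].add(gene_b)
        neighbors[gene_b].add(gene_a)
    # Sort by descending degree, then by name for a deterministic order.
    ranked = sorted(neighbors.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(gene, len(partners)) for gene, partners in ranked[:top_n]]


# Hypothetical toy network with placeholder labels (not real study data).
toy_edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
    ("GENE_B", "GENE_A"),  # duplicate of the first edge, counted once
]

for gene, degree in rank_hub_genes(toy_edges, top_n=2):
    print(gene, degree)
```

In a real analysis the edge list would come from an interaction database restricted to the selected genes, and the ranking could be repeated with other centrality measures to check that the hub set is stable.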