When thinking about the association rule, answer the following questions this week.
1. What is the association rule in data mining?
2. Why is the association rule especially important in big data analysis?
3. How does the association rule allow for more advanced data interpretation?
Association rule mining is a data mining method for discovering interesting relationships between variables in large datasets. It builds on the concept of “frequent itemsets”, which are sets of items that frequently appear together in transactions, such as products that are often bought together. The process generally involves analyzing transactions to look for patterns or associations between different combinations of items (Agrawal & Srikant, 1994). For instance, if customers typically purchase bread with butter, an association rule can be mined from the data to show that these two items have a strong relationship and could be marketed together. Association rules also allow analysts to determine which item combinations have the strongest relationship and which are more likely to result in sales when offered together (Vijayaraghavan & Menzies, 2009).
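The bread-and-butter example can be made concrete with a small sketch. The transactions below are invented for illustration, and the measures used (support, confidence, and lift) are the standard ones for rating a rule, although the text above does not name them. The itemset search is a brute-force enumeration, not the pruned Apriori procedure, so it only suits toy data.

```python
from itertools import combinations

# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions)


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return count(itemset, transactions) / len(transactions)


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force search for itemsets whose support is at least min_support."""
    items = sorted(set().union(*transactions))
    found = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                found[frozenset(candidate)] = s
    return found


def confidence(antecedent, consequent, transactions):
    """Share of transactions with the antecedent that also hold the consequent."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / count(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence relative to the consequent's baseline support (1.0 = independent)."""
    return confidence(antecedent, consequent, transactions) / support(consequent, transactions)


frequent = frequent_itemsets(transactions, min_support=0.5)
for itemset, s in frequent.items():
    print(sorted(itemset), f"support={s:.2f}")

print(f"confidence(bread -> butter) = {confidence({'bread'}, {'butter'}, transactions):.2f}")
print(f"lift(bread -> butter) = {lift({'bread'}, {'butter'}, transactions):.2f}")
```

In this toy data, bread and butter appear together in three of five baskets, and three of the four baskets containing bread also contain butter. A lift above 1 would suggest the items co-occur more often than chance alone predicts, while a lift at or below 1 would not, which is why a high confidence on its own is not enough to call a relationship strong.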
The association rule is especially important in big data analysis due to its ability to uncover meaningful patterns within large datasets. Big data analytics often involves analyzing high volumes of structured or unstructured data. By using an association rule approach, analysts can quickly identify meaningful correlations among different item combinations and filter through massive amounts of data more efficiently than traditional methods (Kumar et al., 2018). For example, supermarkets might use association rules to analyze customer purchasing behavior across their stores and identify specific products whose sales increase when they are placed near each other on shelves. By leveraging the power of big data analysis with the association rule approach, companies can gain valuable insights into customer buying habits and optimize their marketing strategies accordingly (Liu et al., 2017).
In conclusion, the association rule is a powerful tool for uncovering relationships within large datasets by identifying frequent itemsets. It provides valuable insights about customer purchasing behavior that would otherwise remain hidden in large datasets. Its application becomes increasingly important as businesses strive to use big data analytics for competitive advantage, making it an essential technique in today’s digital age.
The basic aim of personalized medicine is to apply the right therapy to the right population of patients by defining disease at the molecular level. Identifying differences among individuals therefore supports new treatment methods and helps pharmaceutical companies develop new cancer drugs. Patients with similar clinical outcomes and histological tumor types can respond differently to the same drug (17). Predicting who will be a nonresponder reduces the harm a drug does to nonresponders, such as potential toxic effects and unnecessary cost. Likewise, when drug companies develop a new drug, they focus on the patient population that benefits from it in order to increase the rate of positive responses (17).
The U.S. Food and Drug Administration has advanced the development of targeted therapy. For example, imatinib mesylate is used to treat chronic myeloid leukemia and gastrointestinal stromal tumor (18), and trastuzumab (Herceptin) is used to treat breast cancer (19). The molecular characteristics of these cancers, namely abnormal protein tyrosine kinase activity in chronic myeloid leukemia and gastrointestinal stromal tumor and the HER-2 receptor in breast cancer, are used as predictive biomarkers. With these markers, only individuals who carry the molecular alteration are selected, which means they are suitable candidates for the treatment. In this way, the survival rate for some cancer types has shifted from 0 to 70% (17).
This approach is also used in the treatment of non-small cell lung cancer through mutation screening. In this cancer type, mutations occur in the kinase domain of EGFR. Gefitinib (Iressa) and erlotinib are tyrosine kinase inhibitors used for treatment, and patients with such mutations show a higher response to them (20). These drugs are also effective in Asian women who have never smoked and who have adenocarcinomas (21). On the other hand, if mutations occur in the downstream effector KRAS, the patient is resistant to erlotinib (22). KRAS mutations likewise confer resistance to cetuximab (Erbitux) and panitumumab (Vectibix) in colon cancer patients, whereas these drugs are effective in patients whose KRAS is wild type (23). These specific and differing responses depend on the molecular profile. Some molecular tests are therefore performed before cetuximab or panitumumab is given to a colon cancer patient. In lung and colon cancer, targeted therapy guides treatment decisions through an understanding of the structure of the cancer (24).
Pharmacogenomics and treatment safety
Genes that carry genetic variation encode drug-metabolizing enzymes, drug transporters, or drug targets. Variation in these genes, which can predict the dose and safety of treatment for different types of cancer patients, can adversely affect their treatment (25). For instance, polymorphisms in cytochrome P450 enzymes can cause a drug to be metabolized very slowly or very quickly. By changing the pharmacokinetics of the drug, this can leave the patient with overdose symptoms or with no response to the drug, and it may also cause an adverse drug reaction (26). Polymorphism data can therefore be used to forecast the optimal drug dose and to reduce harmful side effects (27). In familial breast cancer, patients with a genetic variation in CYP2D6 that makes them poor metabolizers show a low survival rate when treated with the anti-estrogen drug tamoxifen (28). Some studies have addressed genetic testing information on drug labels, including tests for CYP450 polymorphisms.
Prognosis
Instead of using clinicopathologic parameters as biomarkers in biochemical testing for prognosis and for selecting a therapeutic approach for cancer patients, genotyping or gene expression profiling by microarray and protein analysis by mass spectrometry are used to obtain prognostic biomarkers, based on an understanding of the molecular mechanisms of cancer subtypes (29).
Biomarkers can be used alone or in combination with other parameters to classify subgroups according to their risk and to guide therapy decisions. For example, tissue microarray analysis combining molecular and clinical biomarkers is more efficient than the classical cl