Title: Data mining
Description: It's all about data mining and its application in fraud detection.
Document Preview
Extracts from the notes are below, to see the PDF you'll receive please use the links above
USES OF DATA MINING IN FRAUD DETECTION
Data mining is the process of looking for trends, patterns, and anomalies within a data
set
...
However, unusual transactions, those falling outside expected norms,
may signal the need for an investigation by forensic accounting investigators who will apply
their experience in data mining and in investigative techniques to the overall situation that
appears to be emerging
...
Some
programs also allow users to add their own metadata to a file, such as a document title, the
subject of the file, the name of the author of the document, the name of the manager responsible
for the document, and the name of the company that owns the document
...
This in turn may lead to additional witnesses, such as the author of a
linked document in a remote location
...
There may be technical, organizational, or legal barriers that
potentially prevent the forensic accounting investigator from gaining access to that data
...
Assessing Data Quality and Format
Early assessment of the quality and format of available data is vital to ensure that the
forensic accounting team’s technological resources will be used effectively
...
However, cases in which no tool can
be found to deal with system output are rare, and in most instances at least some of the data is
reliable
...
Additionally, if the data is of a highly specific nature, it
may be necessary at this stage to involve individuals with specialized industry and technical
skills and knowledge
...
Close coordination between the forensic technologists and the organization’s systems
administrator is highly recommended
...
Data can come from
many different sources within an organization, and while financial reporting systems are
frequently the main data sources, less obvious sources may exist from which supplementary
information can be gathered
...
In their search for data, forensic accounting investigators should not focus exclusively on
digital sources
...
For example, bank statements are often scanned and then matched
against the accounting system, creating, in effect, electronic bank reconciliations
...
However, the following are among the most useful and readily
available records:
• Vendor master file
• Employee master file
• Customer master file
• General ledger detail
• Cash disbursements
• Customer invoices
• Other data or data sources, depending on the circumstance, including receiving and
purchasing information, telephone data, voice mail, e-mail, personal digital assistants,
BlackBerrys, and computer hard drives
...
In descending order of
convenience for the forensic accounting team, the formats include database files, delimited text
files, and headed report files
...
Awareness of common pitfalls and mistakes can help avoid frustration and erroneous
conclusions
...
The team should establish a protocol for requesting specific information by
combining requests and by speaking to IT and finance/accounting personnel to verify
completeness
...
Requesting a data dictionary can be helpful to an
understanding of what types of information are captured in a system or in a database of
information
...
Another reason the scope of the data specification request should be as broad as possible
has to do with the processing time required for extraction.
The specifics of any data request typically depend on the nature of the organization and
its systems and on the nature of the potential fraud
...
The most vital element of any data extract, the primary key, is the field that
uniquely identifies each record in the data set
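
A minimal sketch of how an extract could be checked for a usable primary key. The function name check_primary_key, the field name vendor_id, and the sample rows are hypothetical and only illustrate the idea that every record needs exactly one non-empty key value.

```python
from collections import Counter

def check_primary_key(records, key_field):
    """Count records with no key and list key values that occur more than once."""
    values = [rec.get(key_field) for rec in records]
    missing = sum(1 for v in values if v in (None, ""))
    counts = Counter(v for v in values if v not in (None, ""))
    duplicates = sorted(v for v, n in counts.items() if n > 1)
    return {"missing": missing, "duplicates": duplicates}

# Hypothetical vendor master extract (illustrative data only).
vendors = [
    {"vendor_id": "V001", "name": "Acme Ltd"},
    {"vendor_id": "V002", "name": "Birch & Co"},
    {"vendor_id": "V002", "name": "Birch and Company"},
    {"vendor_id": "", "name": "Unknown"},
]

key_report = check_primary_key(vendors, "vendor_id")
print(key_report)
```

A field that comes back with missing or duplicated values cannot serve as the primary key without further cleaning or clarification from the client.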
...
These keys are present when the data provided consists of more than one
table
...
• Lookup tables
...
For example, branch codes may exist but not a list of the
specific branch to which each code relates
...
Comparing the number of records received with the number of records that
the client thinks it has extracted is an apparently simple but nonetheless important step
...
With large-scale data extracts, hash totals requested from the client should
be compared with totals computed from the data used in the investigation
...
They provide an
additional check that the data has not been compromised as it passed from the client’s system to
the investigation database
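
The record-count and hash-total comparison can be sketched as follows. This is an illustration under stated assumptions: the field name amount, the sample rows, and the client-supplied figures are hypothetical, and amounts are assumed to arrive as plain decimal strings.

```python
from decimal import Decimal

def control_totals(rows, amount_field):
    """Return the record count and the sum of one numeric field."""
    count = 0
    total = Decimal("0")
    for row in rows:
        count += 1
        total += Decimal(row[amount_field])
    return count, total

def reconcile(rows, amount_field, expected_count, expected_total):
    """Compare computed control totals with the figures the client reported."""
    count, total = control_totals(rows, amount_field)
    return {
        "count": count,
        "total": total,
        "count_matches": count == expected_count,
        "total_matches": total == Decimal(expected_total),
    }

# Hypothetical cash disbursement extract loaded into the investigation database.
loaded_rows = [
    {"doc": "D1", "amount": "100.50"},
    {"doc": "D2", "amount": "20.25"},
    {"doc": "D3", "amount": "-5.00"},
]

# Hypothetical figures reported by the client for the same extract.
check = reconcile(loaded_rows, "amount", expected_count=3, expected_total="115.75")
print(check)
```

Decimal is used rather than float so that currency amounts sum exactly. A mismatch in either figure suggests records were lost or altered in transit.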
...
The systems administrator should provide a short
description for each field within each table
...
• Extraction specification
...
E-mailing such requests is a perfectly acceptable method of communication.
Data Cleaning
This process involves removing page headers and footers from files, expanding data with
more than one row per record, stripping out nonnumeric characters from number fields, and a
host of other procedures aimed at standardizing the data to make it suitable for use with the
analysis tool selected as most appropriate for the data analysis exercise
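
Two of the cleaning steps named above, removing page headers and stripping nonnumeric characters from number fields, might look like this in outline. The header prefixes, the report lines, and the convention that parentheses mark a negative amount are assumptions made for the example, not details from the notes.

```python
import re
from decimal import Decimal

def drop_report_furniture(lines, header_prefixes):
    """Remove blank lines and lines that start with a known header/footer prefix."""
    prefixes = tuple(header_prefixes)
    return [ln for ln in lines if ln.strip() and not ln.startswith(prefixes)]

def clean_amount(text):
    """Strip currency symbols and separators; return None if no digits remain.

    Assumes at most one decimal point and that (123.45) or -123.45 is negative.
    """
    text = text.strip()
    negative = text.startswith("-") or (text.startswith("(") and text.endswith(")"))
    digits = re.sub(r"[^0-9.]", "", text)
    if not digits:
        return None
    value = Decimal(digits)
    return -value if negative else value

# Hypothetical headed report file (illustrative only).
report = [
    "ACME LEDGER REPORT",
    "Page 1",
    "INV-1  $1,234.50",
    "",
    "Page 2",
    "INV-2  (75.00)",
]

body = drop_report_furniture(report, ["ACME LEDGER REPORT", "Page "])
amounts = [clean_amount(line.split(None, 1)[1]) for line in body]
print(body, amounts)
```

Keeping each cleaning rule in its own small function makes it easier to document exactly what was changed, which matters because this is the stage at which the extracted data is deliberately altered.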
...
It is the only point at
which alterations are intentionally made to the extracted data, and it is imperative that
amendments made at this stage not affect the accuracy of the information
...
Because of the expense involved
in reviewing such duplicative materials, elimination of duplicates (deduplication, or deduping) in
the recovered data sets is often the first order of business after the data has been acquired and the
documentation has been completed
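
One common way to eliminate exact duplicates is to compare a cryptographic hash of each file's contents. The sketch below assumes that approach (the notes do not name a method), and the file names and contents are hypothetical.

```python
import hashlib

def dedupe(documents):
    """Split (name, content_bytes) pairs into unique names and duplicates.

    Returns (unique, duplicates), where duplicates maps each duplicate's
    name to the name of the first document with identical content.
    """
    first_seen = {}
    unique = []
    duplicates = {}
    for name, content in documents:
        digest = hashlib.sha256(content).hexdigest()
        if digest in first_seen:
            duplicates[name] = first_seen[digest]
        else:
            first_seen[digest] = name
            unique.append(name)
    return unique, duplicates

# Hypothetical recovered files (illustrative only).
recovered = [
    ("memo_v1.txt", b"Approve payment to vendor V002"),
    ("memo_copy.txt", b"Approve payment to vendor V002"),
    ("memo_v2.txt", b"Approve payment to vendor V003"),
]

unique_docs, duplicate_docs = dedupe(recovered)
print(unique_docs, duplicate_docs)
```

This only catches byte-for-byte copies. Near-duplicates, such as the same memo with a changed footer, would need a different comparison.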
...
Several classes of data mining application are used in fraud detection
...
The categorical labels are predefined,
discrete and unordered
...
Common classification
techniques include neural networks, decision trees, and support vector machines
...
Clustering
Clustering is used to divide objects into conceptually meaningful groups, with the objects
in a group being similar to one another but very dissimilar to the objects in other groups
...
Cluster analysis concerns the problem of decomposing or partitioning a data set
into groups so that the points in one group are similar to one another but
dissimilar to those in another cluster
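
As an illustration of partitioning, here is a small one-dimensional k-means sketch. K-means is one common clustering algorithm and is an assumption of this example; the notes do not name a specific method. The transaction amounts and starting centers are hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Assign each value to its nearest center, then move centers to group means."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # An empty group keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical transaction amounts with two obvious groupings.
amounts = [10, 12, 11, 95, 100, 105]
final_centers, final_groups = kmeans_1d(amounts, centers=[0, 50])
print(final_centers, final_groups)
```

Values within each resulting group sit close to their own center and far from the other, which is the similarity property described above.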
...
In relation to prediction, the attribute for which the values are being predicted is
continuous-valued (ordered) rather than categorical (unordered)
...
Outlier detection
Outlier detection is employed to measure the “distance” between data objects to detect
those objects that are grossly different from or inconsistent with the remaining data set
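
A minimal sketch of distance-based outlier flagging. It assumes a modified z-score built on the median and the median absolute deviation, which is one simple choice of "distance" and not a method taken from the notes. The threshold of 3.5 and the sample amounts are likewise illustrative.

```python
import statistics

def flag_outliers(values, threshold=3.5):
    """Return values whose modified z-score exceeds the threshold."""
    values = list(values)
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        # All values are (nearly) identical; this measure cannot rank them.
        return []
    return [v for v in values if 0.6745 * abs(v - median) / mad > threshold]

# Hypothetical payment amounts with one value far from the rest.
payments = [120, 130, 125, 118, 122, 5000]
outliers = flag_outliers(payments)
print(outliers)
```

The median is used instead of the mean so that a single extreme value does not distort the baseline it is being measured against.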
...
A
commonly used technique in outlier detection is the discounting learning algorithm
...
The regression
technique is typically undertaken using such mathematical methods as logistic regression and
linear regression, and it is used in the detection of credit card, automobile insurance, and
corporate fraud
...
Four main data mining techniques are used in fraud detection, namely:
Logistic model
The logistic model is a generalized linear model that is used for binomial regression, in
which the predictor variables can be either numerical or categorical
...
It is used to predict whether a
given credit card transaction is fraudulent or not
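
To make the scoring step concrete, the sketch below applies the logistic function to a weighted sum of predictors. The feature names, weights, and intercept are hypothetical and have not been fitted to any data. In practice they would be estimated from labelled transactions.

```python
import math

def fraud_probability(features, weights, intercept):
    """Logistic model: probability = 1 / (1 + exp(-(intercept + sum(w * x))))."""
    z = intercept + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients (illustrative only, not estimated from data).
weights = {"amount_thousands": 0.8, "is_foreign": 1.5}
intercept = -4.0

# A categorical predictor is encoded as 0 or 1; a numerical one is used directly.
p_small_domestic = fraud_probability({"amount_thousands": 0.1, "is_foreign": 0}, weights, intercept)
p_large_foreign = fraud_probability({"amount_thousands": 3.0, "is_foreign": 1}, weights, intercept)
print(p_small_domestic, p_large_foreign)
```

A transaction is then labelled as fraud or not by comparing the probability with a chosen cut-off.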
...
It is widely applied in classification and clustering. Its
advantages are that it is adaptive and can generate robust models, and the classification process
can be modified if new training weights are set
...
Bayesian belief network
The Bayesian belief network represents a set of random variables and their conditional
independence using a directed acyclic graph, in which nodes represent variables and missing edges
encode conditional independencies between the variables
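
A tiny three-node network can illustrate the idea. In this sketch the node Fraud has edges to two evidence nodes, foreign and large, and there is no edge between the evidence nodes, so they are conditionally independent given Fraud. All node names and probabilities are hypothetical.

```python
# Prior probability of the parent node (hypothetical).
PRIOR_FRAUD = 0.01

# P(evidence is true | fraud state) for each child node (hypothetical).
CPT = {
    "foreign": {True: 0.6, False: 0.05},
    "large": {True: 0.5, False: 0.1},
}

def joint(fraud, evidence):
    """P(fraud state, evidence), factorized along the graph's edges."""
    p = PRIOR_FRAUD if fraud else 1 - PRIOR_FRAUD
    for name, observed in evidence.items():
        p_true = CPT[name][fraud]
        p *= p_true if observed else 1 - p_true
    return p

def posterior_fraud(evidence):
    """P(fraud | evidence), obtained by normalizing the two joint probabilities."""
    p_fraud = joint(True, evidence)
    p_legit = joint(False, evidence)
    return p_fraud / (p_fraud + p_legit)

posterior = posterior_fraud({"foreign": True, "large": True})
print(round(posterior, 4))
```

Because of the missing edge, the joint probability is a simple product of one prior and two conditional probabilities, with no table needed for the pair of evidence nodes together.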
...
Decision tree
Decision trees are predictive decision support tools that create a mapping from observations
to possible consequences
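
The mapping from observations to consequences can be sketched with a hand-written tree. The split features, thresholds, and labels below are hypothetical. A real tree would be learned from labelled data.

```python
# Each internal node tests one feature against a threshold;
# each leaf holds the resulting label. All values are illustrative.
TREE = {
    "feature": "amount",
    "threshold": 1000,
    "below": {"label": "legitimate"},
    "above": {
        "feature": "is_foreign",
        "threshold": 0.5,
        "below": {"label": "legitimate"},
        "above": {"label": "review"},
    },
}

def classify(node, observation):
    """Walk from the root to a leaf, following the branch each test selects."""
    while "label" not in node:
        value = observation[node["feature"]]
        node = node["above"] if value > node["threshold"] else node["below"]
    return node["label"]

decision_small = classify(TREE, {"amount": 40, "is_foreign": 1})
decision_large_foreign = classify(TREE, {"amount": 2500, "is_foreign": 1})
print(decision_small, decision_large_foreign)
```

Each path from root to leaf reads as a plain rule, which is one reason decision trees are easy to explain to non-technical reviewers.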
...
A customer or merchant is treated as
suspicious if the mail is fake, and all information about the owner/sender is then traced through
the IP address
...
The decision tree is among the most
powerful techniques in data mining, and it is a vital part of credit card fraud detection