Artificial Intelligence or AI complete notes | More Info | Notesale | Buy and Sell Study Notes Online | Extra Student Income | University Notes

Title: Artificial Intelligence or AI complete notes
Description: These notes have full theory knowledge as well as a few examples and algorithms of AI for upcoming data scientists, AI engineers and any interested person from computer science domain. the codes are python based and if needed, i can provide detailed explained codes for all. Anybody can understand these as they are very easy to understand and well explained.

Extracts from the notes are below, to see the PDF you'll receive please use the links above

Unit 1

What is Artificial Intelligence?
In its simplest form, artificial intelligence is a field that combines computer science and robust
datasets to enable problem-solving
...
These disciplines
comprise AI algorithms that seek to create expert systems which make predictions or
classifications based on input data
...

Strong AI is made up of Artificial General Intelligence (AGI) and Artificial Super Intelligence (ASI)
...

History of Artificial Intelligence
• 1950: Alan Turing publishes Computing Machinery and Intelligence
...
The
value of the Turing test has been debated ever since
...
(McCarthy would go on to invent the Lisp language
...
C
...

• 1958: Frank Rosenblatt builds the Mark 1 Perceptron, the first computer based on a neural
network that 'learned' through trial and error
...

• 1980s: Neural networks that use a backpropagation algorithm to train themselves become widely
used in AI applications
...

• 2011: IBM Watson beats champions Ken Jennings and Brad Rutter at Jeopardy!

• 2015: Baidu's Minwa supercomputer uses a special kind of deep neural network called a
convolutional neural network to identify and categorize images with a higher rate of accuracy
than the average human
...
The victory is significant given the huge
number of possible moves as the game progresses (over 14
...

Later, Google purchased DeepMind for a reported USD 400 million
...
Speech recognition: It is also known as automatic speech recognition (ASR), computer
speech recognition, or speech-to-text, and it is a capability which uses natural language
processing (NLP) to process human speech into a written format
...
g
...

2
...
They answer frequently asked questions (FAQs) around topics, like shipping, or
provide personalized advice, cross-selling products or suggesting sizes for users, changing the
way we think about customer engagement across websites and social media platforms
...

3
...
This ability to provide recommendations distinguishes it from image
recognition tasks
...

4
...
This
is used to make relevant add-on recommendations to customers during the checkout process
for online retailers
...
Automated stock trading: Designed to optimize stock portfolios, AI-driven high-frequency
trading platforms make thousands or even millions of trades per day without human
intervention
...
It is a collection of nodes that are connected by edges and has a
hierarchical relationship between the nodes
...
Each node can have multiple child nodes, and these child
nodes can also have their own child nodes, forming a recursive structure
...
The vertices are sometimes
also referred to as nodes and the edges are lines or arcs that connect any two nodes in the graph
...
The graph is
denoted by G(V, E)
...

A state is a time snapshot representing some aspect of the problem
...
If an
operation can change one state into another, then the two states are connected in the
state-space graph
...

The nodes of a state space represent states, and the arcs connecting them represent actions
...

A problem's solution is a node in the graph representing all possible states of the problem
...

State Space Representation consists of identifying an INITIAL STATE (from where to begin) and a
GOAL STATE (the final destination) and then following a specific sequence of actions that passes through intermediate states
...

Search Algorithms

Uninformed Search Algorithms
The search algorithms in this section have no additional information about the goal node other than
what is provided in the problem definition
...
Uninformed search is also called Blind search
...
They are:
• Depth First Search
• Breadth First Search
• Uniform Cost Search

Each of these algorithms will have:

• A problem graph, containing the start node S and the goal node G
...

• A fringe, which is a data structure used to store all the possible states (nodes) that you can
reach from the current state
...

• A solution plan, which is the sequence of nodes from S to G
...
This
information is obtained by something called a heuristic
...
For example – Manhattan distance, Euclidean distance, etc
...
Different heuristics are used in the different informed algorithms discussed below
...
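The two distance measures named above can be sketched in Python as follows. The function names and sample points are illustrative assumptions, not taken from the notes.

```python
import math

def manhattan(a, b):
    """Sum of absolute coordinate differences (moves on a grid without diagonals)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(manhattan((0, 0), (3, 4)))  # 7
print(euclidean((0, 0), (3, 4)))  # 5.0
```

The smaller the value a heuristic returns for a state, the closer that state is estimated to be to the goal.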
It starts at the root of the tree (or at an arbitrary node of the graph) and visits all nodes at the
current depth level before moving on to the nodes at the next depth level
...

The only catch here is that, unlike trees, graphs may contain cycles, so we may come to the same
node again
...

A Boolean visited array is used to mark the visited vertices
...
BFS uses a queue data structure for traversal
...
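A minimal Python sketch of breadth-first search as described above, using a queue for the fringe and a visited set in place of the Boolean visited array. The adjacency-list graph and the function name bfs are assumptions made for illustration.

```python
from collections import deque

def bfs(graph, start, goal):
    """Return a path with the fewest edges from start to goal, or None."""
    visited = {start}             # plays the role of the Boolean visited array
    queue = deque([[start]])      # FIFO queue of partial paths (the fringe)
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in graph.get(node, []):
            if neighbour not in visited:   # skip nodes reached before (handles cycles)
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None

# Hypothetical graph with a cycle (C links back to S).
graph = {"S": ["A", "B"], "A": ["C"], "B": ["C", "G"], "C": ["S", "G"], "G": []}
print(bfs(graph, "S", "G"))  # ['S', 'B', 'G']
```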

If there is a solution, BFS will definitely find it
...
If there is a solution then BFS is guaranteed to find it
...

Easily programmed
...
Since each level of the tree must be
saved in order to generate the next level, and the amount of memory is proportional to the
number of nodes stored, the space complexity of BFS is O(b^d), where b is the branching factor and d is the depth of the solution
...

Depth First Search

Depth-first search is an algorithm for traversing or searching tree or graph data structures
...

So, the basic idea is to start from the root or any arbitrary node, mark the node, move to
an adjacent unmarked node, and continue this loop until there is no unmarked adjacent node
...
Finally, print the nodes in the
path
...
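The mark-and-move idea above can be sketched recursively in Python. The graph, the function name dfs, and the choice to return the path rather than print it inside the function are assumptions for illustration.

```python
def dfs(graph, node, goal, visited=None, path=None):
    """Depth-first search; returns the first path found from node to goal, or None."""
    visited = set() if visited is None else visited
    path = [] if path is None else path
    visited.add(node)             # mark the node
    path.append(node)
    if node == goal:
        return list(path)
    for neighbour in graph.get(node, []):
        if neighbour not in visited:          # move to an adjacent unmarked node
            found = dfs(graph, neighbour, goal, visited, path)
            if found:
                return found
    path.pop()                    # dead end: backtrack
    return None

# Hypothetical graph with a cycle (C links back to S).
graph = {"S": ["A", "B"], "A": ["C"], "B": ["C", "G"], "C": ["S", "G"], "G": []}
print(dfs(graph, "S", "G"))  # ['S', 'A', 'C', 'G']
```

On this graph the path found has three edges, while a two-edge path S, B, G exists, which illustrates that DFS does not guarantee a minimal solution.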
This is in contrast with
breadth-first search which requires more space
...

The time complexity of a depth-first search to depth d is O(b^d), since it generates the same
set of nodes as breadth-first search, but simply in a different order
...

If depth-first search finds a solution without exploring much of a path, then the time and space
it takes will be very small
...
By chance
DFS may find a solution without examining much of the search space at all
...
Even a finite graph can generate an infinite tree. One solution to this
problem is to impose a cutoff depth on the search
...
If the
chosen cutoff depth is less than d, the algorithm will fail to find a solution, whereas if the
cutoff depth is greater than d, a large price is paid in execution time, and the first solution
found may not be an optimal one
...

And there is no guarantee of finding a minimal solution if more than one solution exists
...
In this algorithm, starting from the initial state, we
visit the adjacent states and choose the least costly one. We then choose the next least
costly state from all the unvisited states adjacent to the visited states, and in this way we try to
reach the goal state (note that we do not continue a path through a goal state). Even if we reach the
goal state, we continue searching for other possible paths (if there are multiple goals)
...
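A minimal sketch of uniform cost search in Python, using a priority queue ordered by path cost. The weighted graph and the function name are hypothetical, and for simplicity this version stops at the first goal it removes from the queue instead of continuing to look for further goals.

```python
import heapq

def uniform_cost_search(graph, start, goal):
    """Return (cost, path) for a cheapest path from start to goal, or None."""
    frontier = [(0, start, [start])]      # priority queue ordered by path cost
    best = {start: 0}                     # cheapest known cost to each node
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue                      # stale entry: a cheaper route was found later
        for neighbour, step in graph.get(node, {}).items():
            new_cost = cost + step
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(frontier, (new_cost, neighbour, path + [neighbour]))
    return None

# Hypothetical weighted graph: graph[u][v] is the cost of the edge u -> v.
graph = {"S": {"A": 1, "B": 5}, "A": {"B": 2, "G": 9}, "B": {"G": 3}, "G": {}}
print(uniform_cost_search(graph, "S", "G"))  # (6, ['S', 'A', 'B', 'G'])
```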

Best First Search
Greedy best-first search algorithm always selects the path which appears best at that moment
...
It uses a heuristic
function to guide the search
...
With the
help of best-first search, at each step, we can choose the most promising node
...
e
...

This algorithm can be more efficient than the BFS and DFS algorithms
...

It can get stuck in a loop, like DFS
...

A* Search
A* search is the most commonly known form of best-first search
...
It combines features of UCS and greedy best-first search, by which it solves the problem efficiently
...
This search algorithm expands a smaller search tree
and provides an optimal result faster
...

In the A* search algorithm, we use the search heuristic h(n) as well as the cost g(n) to reach the node
...

f(n) = g(n) + h(n)
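The evaluation function above can be sketched in Python as follows: each frontier entry is ordered by f = g + h. The weighted graph, the heuristic table h, and the function name a_star are hypothetical; the heuristic values were chosen so that they never overestimate the true remaining cost.

```python
import heapq

def a_star(graph, h, start, goal):
    """Return (cost, path) using f(n) = g(n) + h(n) to order the frontier."""
    frontier = [(h[start], 0, start, [start])]   # entries are (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue                             # stale entry
        for neighbour, step in graph.get(node, {}).items():
            new_g = g + step                     # cost to reach the neighbour
            if new_g < best_g.get(neighbour, float("inf")):
                best_g[neighbour] = new_g
                new_f = new_g + h[neighbour]     # f(n) = g(n) + h(n)
                heapq.heappush(frontier, (new_f, new_g, neighbour, path + [neighbour]))
    return None

# Hypothetical weighted graph and heuristic estimates of the cost to reach G.
graph = {"S": {"A": 1, "B": 5}, "A": {"B": 2, "G": 9}, "B": {"G": 3}, "G": {}}
h = {"S": 5, "A": 4, "B": 3, "G": 0}
print(a_star(graph, h, "S", "G"))  # (6, ['S', 'A', 'B', 'G'])
```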
Advantages:
• A* search often performs better than other search algorithms
...

• This algorithm can solve very complex problems
...

A* search algorithm has some complexity issues
...

Unit 2
Minimax Algorithm
Mini-max algorithm is a recursive or backtracking algorithm which is used in decision-making and
game theory
...
Mini-Max algorithm uses recursion to search through the game-tree
...
Such as chess, checkers, tic-tac-toe, Go, and various other two-player
games
...
In this algorithm two
players play the game; one is called MAX and the other is called MIN
...
The two players of the
game are opponents of each other, where MAX will select the maximized value and MIN will select
the minimized value
...
The minimax algorithm proceeds all the way down to the
terminal nodes of the tree, then backtracks up the tree as the recursion unwinds
...
This type of game has a huge branching factor, and the player has lots of choices to
decide between
...
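A minimal Python sketch of the recursion described above. The game tree is hypothetical and is written as nested lists, where a number is the value of a terminal node.

```python
def minimax(node, maximizing):
    """Return the minimax value of a game tree given as nested lists of numbers."""
    if isinstance(node, (int, float)):    # terminal node: return its utility value
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# MAX moves at the root, MIN at the next level; the leaves are terminal values.
tree = [[3, 5], [2, 9], [0, 7]]
print(minimax(tree, True))  # 3
```

MIN reduces the three subtrees to 3, 2 and 0, and MAX then picks the largest of these, 3.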

Alpha Beta Pruning
Alpha-beta pruning is a modified version of the minimax algorithm
...
As we have seen in the minimax search algorithm, the number of
game states it has to examine is exponential in the depth of the tree
...
Hence there is a technique by which we can compute the correct minimax decision
without checking each node of the game tree, and this technique is called
pruning
...
It is also called the Alpha-Beta Algorithm
...

The two parameters can be defined as:
• Alpha: The best (highest-value) choice we have found so far at any point along the path of
Maximizer
...

• Beta: The best (lowest-value) choice we have found so far at any point along the path of
Minimizer
...

Applying alpha-beta pruning to a standard minimax algorithm returns the same move as the standard
algorithm does, but it removes all the nodes that do not affect the final decision and only
make the algorithm slow
...

Condition for Alpha-beta pruning:
α>=β
Key points about alpha-beta pruning:

• The Max player will only update the value of alpha
...

While backtracking the tree, the node values will be passed to upper nodes instead of values
of alpha and beta
...
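The pruning condition α >= β can be sketched in Python as follows. The nested-list game tree is hypothetical, and the optional seen list is an illustrative addition that records which terminal nodes were actually evaluated.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, seen=None):
    """Minimax value with alpha-beta pruning on a nested-list game tree."""
    if isinstance(node, (int, float)):
        if seen is not None:
            seen.append(node)             # record the evaluated leaf
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, seen))
            alpha = max(alpha, value)     # only the Max player updates alpha
            if alpha >= beta:             # pruning condition
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, seen))
        beta = min(beta, value)           # only the Min player updates beta
        if alpha >= beta:
            break
    return value

tree = [[3, 5], [2, 9], [0, 7]]
seen = []
print(alphabeta(tree, True, seen=seen))  # 3
print(seen)                              # [3, 5, 2, 0]
```

The leaves 9 and 7 are never examined, yet the result matches plain minimax on the same tree.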

Inference Engine
The inference engine is the component of the intelligent system in artificial intelligence, which
applies logical rules to the knowledge base to infer new information from known facts
...
The inference engine commonly proceeds in two modes,
which are:
• Forward chaining
• Backward chaining

Horn Clause and Definite clause:
Horn clauses and definite clauses are forms of sentences which enable a knowledge base to use a
more restricted and efficient inference algorithm
...

Definite clause: A clause which is a disjunction of literals with exactly one positive literal is known as
a definite clause or strict Horn clause
...
Hence all definite clauses are Horn clauses
...
It has only one positive literal k
...

Forward Chaining:
Forward chaining is also known as a forward deduction or forward reasoning method when using an
inference engine
...

The forward-chaining algorithm starts from known facts, triggers all rules whose premises are
satisfied, and adds their conclusions to the known facts
...
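A minimal Python sketch of this loop for propositional definite clauses. The rule format (a tuple of premises paired with one conclusion) and the symbols A to F are assumptions made for illustration.

```python
def forward_chain(facts, rules):
    """Repeatedly fire rules whose premises are all known until nothing new is added."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)     # add the rule's conclusion to the known facts
                changed = True
    return known

# Hypothetical rules: A and B imply C; C implies D; E implies F.
rules = [(("A", "B"), "C"), (("C",), "D"), (("E",), "F")]
print(sorted(forward_chain({"A", "B"}, rules)))  # ['A', 'B', 'C', 'D']
```

F is never derived because its premise E is not among the known facts.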

Properties of Forward-Chaining:
• It is a bottom-up approach, as it moves from bottom to top
...

• The forward-chaining approach is also called data-driven, as we reach the goal using the
available data
...

Backward Chaining:

Backward-chaining is also known as a backward deduction or backward reasoning method when
using an inference engine
...

Properties of backward chaining:
• It is known as a top-down approach
...

• In backward chaining, the goal is broken into a sub-goal or sub-goals to prove the facts true
...

• The backward-chaining algorithm is used in game theory, automated theorem proving tools,
inference engines, proof assistants, and various AI applications
...

Knowledge Base
The knowledge base is a central component of a knowledge-based agent; it is also known as the KB
...
These sentences are expressed in a language which is called a knowledge representation
language
...

A knowledge base is required so that an agent can update its knowledge, learn from experience, and take
action according to that knowledge
...
The description is matched against the conditions of the production rules
...
Actions are designed to alter the contents of working memory
...
The need for such a strategy
arises when the conditions of two or more rules are satisfied by the currently known facts
...
They each have advantages which form
their rationales
...
It is possible to favor either the more general or
the more specific case
...
This usefully catches exceptions and other special cases
before firing the more general (default) rules
...

• Not previously used - If a rule's conditions are satisfied, but previously the same rule has
been satisfied by the same facts, ignore the rule
...

• Order - Pick the first applicable rule in order of presentation
...

• Arbitrary choice - Pick a rule at random
...

Propositional Logic
Propositional logic (PL) is the simplest form of logic where all the statements are made by
propositions
...
It is a technique of
knowledge representation in logical and mathematical form
...

In propositional logic, we use symbolic variables to represent the logic, and we can use any
symbol for representing a proposition, such as A, B, C, P, Q, R, etc
...

Propositional logic consists of an object, relations or function, and logical connectives
...

The propositions and connectives are the basic elements of the propositional logic
...

A proposition formula which is always true is called a tautology, and it is also called a valid
sentence
...

A proposition formula which has both true and false values is called a contingency

Statements which are questions, commands, or opinions, such as
"Where is Rohini", "How are you", and "What is your name", are not propositions
...
Example:
All the girls are intelligent
...

Propositional logic has limited expressive power
...

Fuzzy Logic
The word 'fuzzy' refers to things that are not clear or are vague
...
In such cases, this concept
provides many values between true and false and gives the flexibility to find the best solution to
the problem
...

It helps to minimize the logic created by humans
...

It always offers two values, which denote the two possible solutions for a problem and
statement
...

In fuzzy logic, everything is a matter of degree
...

It is based on natural language processing
...

It also allows users to integrate with the programming
...
Fuzzy sets are denoted or
represented by the tilde (~) character
...

Zadeh and Dieter Klaua
...
This theory was released as
an extension of classical set theory
...
The
universe of discourse (U) is also denoted by Ω or X
...
6 ), (X2, 0
...
4)}
And, B is a set which contains following elements:
B = {( X1, 0
...
8), (X3, 0), (X4, 0
...
6), (X2, 0
...
9)}

Intersection Operation
The intersection operation of fuzzy set is defined by:
μA∩B(x) = min (μA(x), μB(x))
Example:
Let's suppose A is a set which contains following elements:
A = {( X1, 0
...
7), (X3, 0
...
1)}
And, B is a set which contains following elements:
B = {( X1, 0
...
2), (X3, 0
...
9)}
then,
A∩B = {( X1, 0
...
2), (X3, 0
...
1)}

Complement Operation
The complement operation of fuzzy set is defined by:
μĀ(x) = 1 - μA(x)
Example:

Let's suppose A is a set which contains following elements:
A = {( X1, 0
...
8), (X3, 0
...
1)}
then,
Ā= {( X1, 0
...
2), (X3, 0
...
9)}

Complement
Consider a fuzzy set denoted by A, and let Y be its complement. Then, for every
member of A, Y will be:
degree_of_membership(Y)= 1 - degree_of_membership(A)

Difference
Consider two fuzzy sets denoted by A and B, and let Y be their difference. Then, for
every member of A and B, Y will be:
degree_of_membership(Y)= min(degree_of_membership(A), 1- degree_of_membership(B))
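The fuzzy set operations can be sketched in Python with dictionaries that map each element to its degree of membership. The membership values below are hypothetical (the values in the examples above are cut off in this preview), the union uses the standard max rule, and results are rounded only to hide floating-point noise.

```python
def fuzzy_union(a, b):
    return {x: max(a[x], b[x]) for x in a}

def fuzzy_intersection(a, b):
    return {x: min(a[x], b[x]) for x in a}

def fuzzy_complement(a):
    return {x: round(1 - a[x], 2) for x in a}

def fuzzy_difference(a, b):
    # min(degree in A, 1 - degree in B), as in the formula above
    return {x: round(min(a[x], 1 - b[x]), 2) for x in a}

# Hypothetical membership degrees over the same four elements.
A = {"X1": 0.3, "X2": 0.7, "X3": 0.5, "X4": 0.1}
B = {"X1": 0.8, "X2": 0.2, "X3": 0.4, "X4": 0.9}

print(fuzzy_union(A, B))         # {'X1': 0.8, 'X2': 0.7, 'X3': 0.5, 'X4': 0.9}
print(fuzzy_intersection(A, B))  # {'X1': 0.3, 'X2': 0.2, 'X3': 0.4, 'X4': 0.1}
print(fuzzy_complement(A))       # {'X1': 0.7, 'X2': 0.3, 'X3': 0.5, 'X4': 0.9}
print(fuzzy_difference(A, B))    # {'X1': 0.2, 'X2': 0.7, 'X3': 0.5, 'X4': 0.1}
```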

Unit 3
Pattern Recognition
Pattern recognition is the process of recognizing patterns by using a machine learning algorithm
...
One of the important
aspects of pattern recognition is its application potential
...

In a typical pattern recognition application, the raw data is processed and converted into a form that
is amenable for a machine to use
...

• In classification, an appropriate class label is assigned to a pattern based on an abstraction
that is generated using a set of training patterns or domain knowledge
...

• Clustering generates a partition of the data which helps decision making, the specific
decision-making activity of interest to us
...

Features may be represented as continuous, discrete, or discrete binary variables
...

Example: consider a face; the eyes, ears, nose, etc. are features of the face
...

Example: In the above example of a face, if all the features (eyes, ears, nose, etc.) are taken together,
then the sequence is a feature vector ([eyes, ears, nose])
...
In the case of speech, MFCCs (Mel-frequency
cepstral coefficients) are the spectral features of the speech
...

Pattern recognition possesses the following features:
• A pattern recognition system should recognize familiar patterns quickly and accurately
• Recognize and classify unfamiliar objects
• Accurately recognize shapes and objects from different angles
• Identify patterns and objects even when partly hidden
• Recognize patterns quickly, with ease, and with automaticity
...
Learning is the most important phase, as how well the system
performs on the data provided to it depends on which algorithms are used on the data
...
e
...
e
...

Training set:
The training set is used to build a model
...
Training rules and algorithms are used to give relevant information on how to associate input
data with output decisions
...
Generally, 80% of the
dataset is used as training data
...
It is the set of data that is used to verify whether the
system is producing the correct output after being trained or not
...
Testing data is used to measure the accuracy of the system
...

It is useful for cloth pattern recognition for visually impaired people
...

We can recognize particular objects from different angles
...

Sometimes to get better accuracy, a larger dataset is required
...

Example: my face vs my friend’s face
...

Computer vision
Pattern recognition is used to extract meaningful features from given image/video samples and is
used in computer vision for various applications like biological and biomedical imaging
...
Statistical pattern recognition is implemented and
used in different types of seismic analysis models
...

Speech recognition
The greatest success in speech recognition has been obtained using pattern recognition paradigms
...
A number of
recognition methods have been used to perform fingerprint matching out of which pattern
recognition approaches are widely used
...

The K-NN algorithm assumes similarity between the new case/data and the available cases and
puts the new case into the category that is most similar to the available categories
...
This means that when new data appears, it can be easily classified into a well-suited
category by using the K-NN algorithm
...

K-NN is a non-parametric algorithm, which means it does not make any assumption about the
underlying data
...

At the training phase, the KNN algorithm just stores the dataset, and when it gets new data,
it classifies that data into the category that is most similar to the new data
...
So for this identification, we can use the KNN
algorithm, as it works on a similarity measure
...
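A minimal Python sketch of K-NN classification. Euclidean distance as the similarity measure, majority voting among the k nearest points, and the tiny labelled dataset are all assumptions made for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Label the query by majority vote among its k nearest training points."""
    # "Training" is just storing the data; all the work happens at prediction time.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical two-feature dataset with two categories.
train = [((1, 1), "cat"), ((1, 2), "cat"), ((2, 1), "cat"),
         ((6, 6), "dog"), ((7, 6), "dog"), ((6, 7), "dog")]
print(knn_predict(train, (2, 2)))  # cat
print(knn_predict(train, (6, 5)))  # dog
```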

Advantages of KNN Algorithm:
• It is simple to implement
...

Disadvantages of KNN Algorithm:
• The value of K always needs to be determined, which may be complex at times
...

K-means Algorithm
K-Means Clustering is an Unsupervised Learning algorithm, which groups the unlabeled dataset into
different clusters
...

It allows us to cluster the data into different groups and is a convenient way to discover the categories
of groups in an unlabeled dataset on its own, without the need for any training
...
The main aim of
this algorithm is to minimize the sum of distances between the data points and the centroids of their corresponding
clusters
...
The value of k should be
predetermined in this algorithm
...

Assigns each data point to its closest k-center
...
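A minimal Python sketch of the assign-and-recompute loop of K-means. Using the first k points as the initial centers is a simplifying assumption (random initialisation is more common), and the points are hypothetical.

```python
import math

def kmeans(points, k, iterations=10):
    """Return (centroids, clusters) after repeated assignment and re-computation."""
    centroids = list(points[:k])          # assumption: first k points as initial centers
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each data point to its closest center
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # recompute each centroid as the mean of its cluster
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:    # no change: the algorithm has converged
            break
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, k=2)
print(clusters)  # [[(1, 1), (1, 2), (2, 1)], [(9, 9), (8, 9), (9, 8)]]
```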

Advantages
• It is very easy to understand and implement
...

• On re-computation of centroids, an instance can change the cluster
...

Disadvantages
• It is a bit difficult to predict the number of clusters, i.e. the value of k
...

• The order of the data will have a strong impact on the final output
...
If we rescale our data by means of normalization or
standardization, then the final output will change completely
...

Decision Trees
• Decision Tree is a supervised learning technique that can be used for both classification and
regression problems, but it is mostly preferred for solving classification problems
...

• In a decision tree, there are two types of nodes, which are the Decision Node and the Leaf Node
...

• The decisions or tests are performed on the basis of the features of the given dataset
...

• It is called a decision tree because, similar to a tree, it starts with the root node, which
expands into further branches and constructs a tree-like structure
...

• A decision tree simply asks a question and, based on the answer (Yes/No), further splits the
tree into subtrees
...

It can be very useful for solving decision-related problems
...

There is less requirement of data cleaning compared to other algorithms
...

It may have an overfitting issue, which can be mitigated using the Random Forest algorithm
...

Genetic Algorithm
Genetic Algorithms (GAs) are adaptive heuristic search algorithms that belong to the larger class of
evolutionary algorithms
...

They are an intelligent exploitation of random search, provided with historical data to direct the search
into the region of better performance in the solution space
...

Genetic algorithms simulate the process of natural selection, which means that those species which can
adapt to changes in their environment are able to survive, reproduce, and go on to the next generation
...
Each generation consists of a population of individuals, and each individual
represents a point in the search space and a possible solution
...
This string is analogous to the Chromosome
...
Following is the foundation of GAs based on this analogy –
1
...
Those individuals who are successful (fittest) then mate to create more offspring than others
3
...

4
...

The population of individuals is maintained within the search space
...
Each individual is coded as a finite length vector
(analogous to chromosome) of components
...

Thus a chromosome (individual) is composed of several genes (variable components)
...
The
individuals having an optimal (or near-optimal) fitness score are sought
...
The individuals having better fitness scores are given more chance to reproduce than others
...
The population size is static, so room has to be created for
new arrivals
...
It is hoped that over
successive generations better solutions will arrive while the least fit die
...
Thus each new generation has better “partial solutions” than previous generations
...
The algorithm is then said to have converged to a set of solutions
for the problem
...

2) Crossover Operator: This represents mating between individuals
...
Then the genes at these crossover
sites are exchanged thus creating a completely new individual (offspring)
...

Fitness score is the number of characters which differ from characters in target string at a particular
index
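A minimal Python sketch of a genetic algorithm that evolves strings towards a target, using the fitness score described above (lower is better). The gene alphabet, population size, mutation rate, elitism of two individuals, and the target string are all illustrative assumptions; the run is seeded so that it is repeatable.

```python
import random
import string

GENES = string.ascii_lowercase + " "      # assumed alphabet of possible genes

def fitness(candidate, target):
    """Number of positions where the candidate differs from the target."""
    return sum(1 for c, t in zip(candidate, target) if c != t)

def crossover(a, b, rng):
    point = rng.randrange(1, len(a))      # single crossover site
    return a[:point] + b[point:]

def mutate(s, rng, rate=0.05):
    return "".join(rng.choice(GENES) if rng.random() < rate else ch for ch in s)

def evolve(target, pop_size=100, generations=500, seed=0):
    rng = random.Random(seed)
    population = ["".join(rng.choice(GENES) for _ in target) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda s: fitness(s, target))
        if fitness(population[0], target) == 0:
            break
        parents = population[: pop_size // 2]      # the fitter half gets to reproduce
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
                    for _ in range(pop_size - 2)]
        population = population[:2] + children     # keep the two best; size stays static
    best = min(population, key=lambda s: fitness(s, target))
    return best, fitness(best, target)

best, score = evolve("hello ai")
print(best, score)
```

Because the two fittest individuals are always carried over, the best score can never get worse from one generation to the next.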
...

Why use Genetic Algorithms
• They are robust
• They provide optimisation over a large state space
...
PSO is a Simulation of a
simplified social system
...

In nature, any bird's observable vicinity is limited to some range
...

Let's mathematically model the above-mentioned principles to make the swarm find the global
minimum of a fitness function
• Each particle in particle swarm optimization has an associated position, velocity, and fitness
value
...

A record of global_bestFitness_position and global_bestFitness_value is maintained
...
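A minimal Python sketch of particle swarm optimization that minimises a simple fitness function. The inertia weight w, the acceleration coefficients c1 and c2, the search bounds, and the sphere function are illustrative assumptions; global_best_position and global_best_value play the role of the global record described above.

```python
import random

def sphere(x):
    """Example fitness function with its global minimum of 0 at the origin."""
    return sum(v * v for v in x)

def pso(fitness, dim=2, n_particles=20, iterations=100, seed=0,
        w=0.7, c1=1.5, c2=1.5, bound=5.0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # each particle's own best position
    best_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    global_best_position, global_best_value = best_pos[g][:], best_val[g]
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (global_best_position[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            value = fitness(pos[i])
            if value < best_val[i]:                # update the particle's own record
                best_pos[i], best_val[i] = pos[i][:], value
                if value < global_best_value:      # update the swarm's global record
                    global_best_position, global_best_value = pos[i][:], value
    return global_best_position, global_best_value

position, value = pso(sphere)
print(position, value)
```

Only fitness values are used, never gradients, which is why the method is derivative free.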

Derivative free
...

Very efficient global search algorithm
...

Disadvantages of PSO:
• Slow convergence in the refined search stage (weak local search ability)
