Title: Associate Analytics
Description: This Facilitators Guidebook for the Associate Analytics program contains detailed facilitation guidelines as well as the exhaustive course material for the Associate Analytics program.

Facilitator Guide – SSC/ Q2101 – Associate Analytics

Associate Analytics
FACILITATOR’S GUIDE – MODULE 1

This Facilitators Guidebook for the
Associate Analytics program contains
detailed facilitation guidelines as well
as the exhaustive course material for
the Associate Analytics program
...
in
W www
...
in

Published by

Building Domain | Enhancing Careers
T: 91 70365 88888
E info@mindmapconsulting
...
mindmapconsulting
...
NASSCOM disclaims all warranties as to the accuracy, completeness or
adequacy of such information
...

Every effort has been made to trace the owners of the copyright material included in
the book
...

No entity in NASSCOM shall be responsible for any loss whatsoever, sustained by any
person who relies on this material
...
No
parts of this report can be reproduced either on paper or electronic media, unless
authorized by NASSCOM
...
1 Introduction to R and R Programming
UNIT 1
...
1 Summarizing Data and Revisiting Probability
UNIT 2
...
0 SQL using R
UNIT 4
...
0 Understanding the Verticals and Requirements Gathering


Introduction
Qualifications Pack – Associate Analytics SSC/Q2101
SECTOR: IT-ITeS
SUB-SECTOR: Business Process Management
OCCUPATION: Analytics
REFERENCE ID: SSC/Q2101
ALIGNED TO NCO CODE: TBD
Brief Job Description: Individuals at this job are responsible for building analytical packages
using Databases, Excel or other Business Intelligence (BI) tools
Personal Attributes: This job requires the individual to follow detailed instructions and
procedures with an eye for detail
...

Eligibility: Bachelor's Degree in Statistics/ Science/Technology, Master's Degree in
Science/Technology/Statistics
Work Experience: 0-1 years of work experience/internship in analytics roles
Roles of Associate Analytics
(--------)


Analytics is a key occupation in the structure of the ITS Sub-Sector


Analytics offers excellent vertical and horizontal movement within its
tracks


Movement to Other Occupations, Sub-sectors and Industries:
Given the dynamic range of services that the BPM sub-sector is increasingly offering to its
clients in the industry, there are a variety of roles that employees are performing across the
entire spectrum of offerings
...



OVERALL QUALIFICATION PACK DETAILS

Job Details

Qualifications Pack Code
Job Role
Credits(NVEQF/NVQF/NSQF)
Sector
Sub-sector
Occupation
KA1
...


Role Description

KA5
...
1
IT-ITeS
Business Process Management
Analytics

Drafted on: 30/04/13
Last reviewed on: 30/04/13
Next review date: 30/06/14

KA2
...
Responsible for building analytical packages using
Databases, Excel or other Business Intelligence (BI) tools
KA10
...


KA7
...

KA9
...
Bachelor's Degree in Statistics/ Science/Technology or any
other course
KA12
...
Courses in SPSS, SAS, STATA and/or Spreadsheets
KA13
...
RDBMS concepts, PL\SQL, OCA certification
KA14
...
Financial and accounting terminologies in respective
language & various accounting standards and GAAPs
KA18
...
0-1 years of work experience/internship in analytics roles
KA19
...
Compulsory:
1
...
SSC/ N 2101 (Carry out rule-based statistical analysis)
3
...
SSC/ N 9002 (Work effectively with colleagues)
KA21
...
SSC/ N 9003 (Maintain a healthy, safe and secure working environment)
Occupational Standards (NOS)
6
...
SSC/ N 9005 (Develop your knowledge, skills and competence)
KA23
...
Optional:
KA25
...
Performance Criteria

KA27
...
The Documentation
types covered would include case studies, best practices, project artifacts, reports, minutes,
policies, procedures, work instructions etc
...


Session Goal
Participants should gain a good hands-on understanding of MS Word and MS Visio,
which they will be required to use to draft various documents and reports
...


Session Objectives
Upon completion of both parts of this course, the participants will be able to:
PC1
...
access existing documents, language standards, templates and documentation tools from
your organization’s knowledge base
PC3
...
confirm the content and structure of the documents with appropriate people
PC5
...
review documents with appropriate people and incorporate their inputs
PC7
...
publish documents in agreed formats
PC9
...
comply with your organization’s policies, procedures and guidelines when creating
documents for knowledge sharing
Note: The material for this NOS has been covered in the Associate Analytics Module 3 Book
(book 3) in Unit 5


SSC/ N 2101 – Carry out rule-based statistical analysis
Session Overview
In the Associate Analytics “Carry out rule-based statistical analysis” module, the participants will go
through Business Analytics using the R tool
...
Furthermore, they will also
have an overview of Big Data tools and their basic functioning
...
Finally the participants will learn about Data Visualization and gather
knowledge on Graphical representation of Data as well as results and reports
...
They will then also learn about Big Data tools and Big Data Analytics
...


Session Objectives
To be competent, participants must be able to:
PC1
...
obtain guidance from appropriate people to identify suitable data sources to agree the
methodological approach
PC3
...
validate data accurately and identify anomalies
PC5
...
carry out rule-based analysis of the data in line with the analysis plan
PC7
...
review the results of your analysis with appropriate people
PC9
...
draw justifiable inferences from your analysis
PC11
...
comply with your organization’s policies, procedures and guidelines when carrying out
rule-based quantitative analysis
Note: The material for this NOS has been covered in all the three Modules of Associate
Analytics

SSC/ N 9001: Manage Your Work to Meet
Requirement
Session Overview
The Associate Analytics Manage your work to meet requirement module is designed to help
participants understand the importance of time in a professional environment and how to manage
multiple time bound requirements
...

Participants learn how to manage work and how to ensure deliverables are completed in
stipulated time in an organization by following tested principles to prevent/handle slippages
on timelines
...

Time management cannot override the qualitative aspect of the deliverable
...
The requirements of a work unit may be further classified
into; activities, deliverable, quantity, standards and timelines
...

Additionally, this session discusses practical application of planning and execution of work
plans to enable the participants to deal effectively with failure points and minimize the
impact, if any
...

Successful candidates will be able to understand the inter-relationship of time, effort,
impact and cost
...
Establish and agree your work requirements with appropriate people
PC2
...
Utilize your time effectively
PC4
...
Treat confidential information correctly
PC6
...
Work within the limits of your job role
PC8
...
Ensure your work meets the agreed requirements
Note: The material for this NOS has been covered in Unit 1 of Module 1
...
It
emphasizes how relationship management is critical to work management
...

Participants learn how to manage cross functional relationships and how to nurture a good
working environment
...


Session Goal
The primary goal of the session is for the participants to understand the importance of
professional relationships with colleagues
...

Successful candidates will be able to understand the inter-relationship of professionalism and
team-work
...
Communicate with colleagues clearly, concisely and accurately
...
Work with colleagues to integrate your work effectively with theirs
...
Pass on essential information to colleagues in line with organizational requirements
...
Work in ways that show respect for colleagues
...
Carry out commitments you have made to colleagues
...
Let colleagues know in good time if you cannot carry out your commitments, explaining
the reasons
...
Identify any problems you have working with colleagues and take the initiative to solve
these problems
...
Follow the organization’s policies and procedures for working with colleagues
...
Much of the material
herein is going to be self-study for the participants


SSC/ N 9003: Maintain a Healthy, Safe and Secure
working Environment
Session Overview
The Associate Analytics Health, Safety and Security module is designed to help participants
understand the importance of following safety rules and regulations at the workplace
...
The module also emphasizes the need for
security and the entities that can pose a threat to it
...

Additionally, this session discusses practical application of the health and safety procedures to
enable the participants to effectively deal with the hazardous events to minimize the impact, if
any
...
Comply with your organization’s current health, safety and security policies and procedures
PC2
...
Identify and correct any hazards that you can deal with safely, competently and within the
limits of your authority
PC4
...
Follow your organization’s emergency procedures promptly, calmly, and efficiently
PC6
...
Complete any health and safety records legibly and accurately
Note: The material for this NOS has been covered in Unit 2 of Module 1
...
This module aims to develop an individual’s understanding, when
working with data, of how to take the data and present it as relevant information
in standardized formats
...


Session Goal
The primary goal of the session is for the participants to analyze data and present it in a
format suitable for the given process or organization
...


Session Objectives
Upon completion of both parts of this course, the participants will be able to:
PC1
...
obtain the data/information from reliable sources
PC3
...
obtain advice or guidance from appropriate people where there are problems with the
data/information
PC5
...
insert the data/information into the agreed formats
PC7
...
report any unresolved anomalies in the data/information to appropriate people
PC9
...
It emphasizes how to enhance skills and knowledge
in a diversified professional environment
...
It gives knowledge on organizational context, technical
knowledge, core skills/generic skills, professional skills and technical skills
...

Successful candidates will be able to understand the relationship between skill enhancement and
growth
...
obtain advice and guidance from appropriate people to develop their knowledge, skills
and competence
PC2
...
identify accurately their current level of knowledge, skills and competence and any
learning and development needs
PC4
...


undertake learning and development activities in line with their plan

PC6
...
obtain feedback from appropriate people on their knowledge and skills and how
effectively they apply them
PC8
...
It may also be defined as a distinct subset of the
economy whose components share similar characteristics and interests
...

Vertical may exist within a sub-sector representing different domain areas or the
client industries served by the industry
...


Function

Function is an activity necessary for achieving the key purpose of the sector,
occupation, or area of work, which can be carried out by a person or a group of
persons
...


Sub-functions

Sub-functions are sub-activities essential to achieving the objectives of
the function
...


Occupational
Standards (OS)

OS specify the standards of performance an individual must achieve when
carrying out a function in the workplace, together with the knowledge and
understanding they need to meet that standard consistently
...


Performance
Criteria

Performance Criteria are statements that together specify the standard of
performance required when carrying out a task

National
Occupational
Standards (NOS)

...

Qualifications Pack
Code

Qualifications Pack Code is a unique reference code that identifies a qualifications
pack

Qualifications
Pack (QP)

...
A Qualifications Pack is
assigned a unique qualification pack code
...


Unit Title

Unit Title gives a clear overall statement about what the incumbent should be
able to do
...
This would be helpful to
anyone searching on a database to verify that this is the appropriate OS they are
looking for
...

Knowledge and Understanding are statements which together specify the
technical, generic, professional and organisational specific knowledge that an
individual needs in order to perform to the required standard
...

Technical Knowledge is the specific knowledge needed to accomplish specific
designated responsibilities
...
These skills are typically needed in any work
environment
...

Helpdesk is an entity to which the customers will report their IT problems
...

Description

IT-ITeS

Information Technology - Information Technology enabled Services

BPM

Business Process Management

BPO

Business Process Outsourcing

KPO

Knowledge Process Outsourcing

LPO

Legal Process Outsourcing

IPO

Information Process Outsourcing

BCA

Bachelor of Computer Applications

B
...


Bachelor of Science

OS

Occupational Standard(s)

NOS

National Occupational Standard(s)

QP

Qualifications Pack

UGC

University Grants Commission

MHRD

Ministry of Human Resource Development

MoLE

Ministry of Labour and Employment

NVEQF

National Vocational Education Qualifications Framework

NVQF

National Vocational Qualifications Framework

NSQF

National Skill Qualification Framework

Helpdesk


Nomenclature for QP & NOS UNITS
_____________________________________________________________________________
Qualifications Pack
9 characters
SSC/Q0101

SSC denoting Software & Services Companies (IT-ITeS industry)
Q denoting Qualifications Pack
Occupation (2 numbers)
QP number (2 numbers)

National Occupational Standard
9 characters
SSC/N0101

SSC denoting Software & Services Companies (IT-ITeS industry)
N denoting National Occupational Standard
Occupation (2 numbers)
NOS number (2 numbers)

Occupational Standard
9 characters
SSC/O0101

SSC denoting Software & Services Companies (IT-ITeS industry)
O denoting Occupational Standard
Occupation (2 numbers)
OS number (2 numbers)
It is important to note that an OS unit can be denoted with either an ‘O’ or an ‘N’
...
An example of OS unit
denoting ‘O’ is SSC/O0101
...
An example of OS unit denoting ‘N’ is SSC/N0101
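
The code structure described above can be checked mechanically. The following is a minimal R sketch, not part of the official nomenclature. It assumes that the first two digits after the letter are the occupation number and the last two are the QP/NOS/OS number, which is consistent with SSC/Q2101 being a Business Process Management code. The function name parse_ssc_code is hypothetical.

```r
# Hypothetical helper: split a 9-character SSC code into its parts.
# Assumption: digits 1-2 are the occupation number and digits 3-4 are
# the QP / NOS / OS number.
parse_ssc_code <- function(code) {
  pattern <- "^SSC/([QNO])([0-9]{2})([0-9]{2})$"
  if (!grepl(pattern, code)) {
    stop("Not a valid SSC code: ", code)
  }
  list(
    type       = sub(pattern, "\\1", code),              # Q, N or O
    occupation = as.integer(sub(pattern, "\\2", code)),  # e.g. 21
    number     = as.integer(sub(pattern, "\\3", code))   # e.g. 1
  )
}

parsed <- parse_ssc_code("SSC/Q2101")
print(parsed)
```

Under the stated assumption, the occupation number can then be compared with the sub-sector ranges listed in this guide.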


The following acronyms/codes have been used in the nomenclature above:
Sub-Sector

Range of Occupation numbers

IT Service (ITS)

01-20

Business Process Management (BPM)

21-40

Engg
...
1
Introduction to Analytics or R programming
Topic

Session Goals

Introduction to Analytics or R
programming

By the end of this session, you will be able to:
1
...

Use functions of R

Material and Handouts
Facilitator Material
Facilitator Guide, Handouts

Participant Material and Handouts
 Participants’ Guide

Session Plan:
Activity | Location
Knowing language R | Classroom
Using R as calculator | Classroom
Understanding components of R | Classroom
Reading database using R | Classroom
Importing & Exporting CSV | Classroom
Working on Variables | Classroom
Outliers and Missing Data treatment | Classroom
Combining Data sets in R | Classroom
Discuss Function and Loops | Classroom
Check your understanding | Classroom
Summary | Classroom


Unit 1
...
r-project
...

 R is the substrate on which we can mount various features using packages such as
Rcmdr (R Commander) or front ends such as RStudio
...

Look at R!

R-Commander Interface

R-Studio Interface

Using R as calculator
R can be used as a calculator
...
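
As a quick illustration of calculator-style use, here is a minimal sketch. The numbers and variable names are arbitrary and chosen only for this example.

```r
# Basic arithmetic, as it could be typed at the R prompt
sum_result  <- 5 + 3      # addition
prod_result <- 12 * 7     # multiplication
pow_result  <- 2^3        # exponentiation
root_result <- sqrt(16)   # square root

print(c(sum_result, prod_result, pow_result, root_result))
```

Each expression can also be typed directly at the prompt without assigning it to a name; R then prints the result immediately.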


Calculate the following using R:
1
...
23 X 32
3
...


Data Type:

At a very broad level, data is classified into two types
...

 Numeric Data: - It includes 0~9, “
...

 Character Data: - Everything except Numeric data type is Character
...

Data is also classified as Quantitative and Qualitative
...
are Quantitative Data
...

For Example, “Good” can be rated as 9 while “Average” can be rated as 5 and “Bad” can be rated as 0
...
Data Frame:
A data frame is used for storing data tables
...

For example, here is a built-in data frame in R, called mtcars
...
Each horizontal line afterward
denotes a data row, which begins with the name of the row and is then followed by the actual data
...
To retrieve data in a cell, we would enter its row and column coordinates in
the single square bracket "[]" operator
...
In other words, the
coordinates begin with the row position, followed by a comma, and end with the column position
...

For Example,
Here is the cell value from the first row, second column of mtcars
...
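
A minimal sketch of this lookup, using the mtcars data frame that ships with R. The variable names cell and same_cell are chosen only for this example.

```r
# mtcars is available in every standard R installation (datasets package)
cell <- mtcars[1, 2]                      # first row, second column

# The same cell addressed by row name and column name
same_cell <- mtcars["Mazda RX4", "cyl"]

print(cell)
```

Addressing by name is usually more robust than addressing by position, because it keeps working if the column order changes.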
Array and Matrices:
We have two different options for constructing matrices or arrays
...

For example, you make an array with four columns, three rows, and two “tables” like this:
> my
...
array” is the name of the array we have given
...

There are 24 units in this array mentioned as “1:24” and are divided in three dimensions “(3, 4, 2)”
...
So, for arrays,
R fills the columns, then the rows, and then the rest
...
This is a little hack that goes a
bit faster than using the array() function; it’s especially useful if you have your data already in a vector
...
)
Say you already have a vector with the numbers 1 through 24, like this:
> my
...
array simply by assigning the dimensions, like
this:
> dim(my
...
Defining the dimensions of the array as 3, 5 and 2
...
By using Vector method
...
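
The two construction methods can be compared side by side. This is a minimal sketch using the 3 x 4 x 2 dimensions from the worked example; the object names demo_array and demo_vector are hypothetical.

```r
# Method 1: the array() function with an explicit dim argument
demo_array <- array(1:24, dim = c(3, 4, 2))

# Method 2: start from a plain vector and assign the dimensions directly
demo_vector <- 1:24
dim(demo_vector) <- c(3, 4, 2)

# R fills columns first, then rows, then the remaining dimension
print(demo_array[, , 1])   # first "table": values 1 to 12
print(demo_array[, , 2])   # second "table": values 13 to 24
```

Both objects hold the same values in the same positions; the second method only changes an attribute of an existing vector.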



Reading Database using R
We can import datasets from various sources with various file types, for example:

...
Each cell inside such data file is
separated by a special character, which usually is a comma, although other characters can be used as well
...
Here is a sample of
the expected format
...
csv" with a text editor, we can read the data
with the function read
...


> mydata = read
...
csv") # read csv file
> mydata
Col1 Col2 Col3
1 100 a1 b1
2 200 a2 b2
3 300 a3 b3
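
To try this without preparing a file by hand, the following self-contained sketch writes the same three rows to a temporary file and reads them back. The names csv_path and demo_data are hypothetical, and the temporary file stands in for whatever path is used in class.

```r
# Write a small CSV to a temporary file so the example runs anywhere
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Col1,Col2,Col3",
             "100,a1,b1",
             "200,a2,b2",
             "300,a3,b3"), csv_path)

# header = TRUE is the default for read.csv, so the first line becomes column names
demo_data <- read.csv(csv_path)

print(demo_data)
str(demo_data)   # shows the type R chose for each column
```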

In various European locales, as the comma character serves as the decimal point, the
function read
...
For further detail of the read
...
csv2 functions, please
consult the R documentation
...
csv)
 Big data tool – Impala
Cloudera Impala is a massively parallel processing (MPP) SQL query engine that runs natively in
Apache Hadoop
...


RImpala enables querying the data residing in HDFS and Apache HBase from R, which can be further
processed as an R object using R functions
...


This package is developed and maintained by MuSigma
...


>install
...
csv (“C:/Rtutorials/Sampledata
...
Similarly, to write a dataset we use the write() function
...


•Help - ?function_name -> It is used to get help on any function in R
...

Name, Age, Sex
Shaan, 21, M
Ritu, 24, F
Raj, 31, M

Working on Variables
Before learning how to create and modify variables in R, we will look at the various operators in R
...

Operator | Description
+ | Addition
- | Subtraction
* | Multiplication
/ | Division
^ or ** | Exponentiation
x %% y | modulus (x mod y); 5 %% 2 is 1
x %/% y | integer division; 5 %/% 2 is 2

Arithmetic Operators

Operator | Description
< | less than
<= | less than or equal to
> | greater than
>= | greater than or equal to
== | exactly equal to
!= | not equal to
!x | Not x
x|y | x OR y
x&y | x AND y
isTRUE(x) | test if x is TRUE

Logical Operators
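
The operators listed above can be tried directly at the prompt. This is a minimal sketch with arbitrary values; all variable names are chosen only for this example.

```r
# Arithmetic operators
remainder <- 5 %% 2     # modulus
quotient  <- 5 %/% 2    # integer division
power     <- 2 ^ 3      # exponentiation

# Comparison and logical operators
x <- 7
in_range <- x > 5 & x <= 10    # AND: both comparisons must hold
either   <- x < 0 | x == 7     # OR: at least one comparison must hold
negated  <- !(x != 7)          # NOT: flips the result of the comparison

print(c(remainder, quotient, power))
print(c(in_range, either, negated))
```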

 Creating New variables: Use the assignment operator “<-” to create new variables
...


 Modifying existing variable: We can rename an existing variable with the rename() function
...

For example,
if we want to recode a variable into categories based on some criterion, as below:
mydata$agecat <- ifelse(mydata$age > 70, "older", "younger")

 Create a new variable Total_Sales using variables Sales_1 and Sales_2
...
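
The variable operations above can be sketched end to end as follows. The data frame demo_sales and its values are hypothetical; only the column names Sales_1, Sales_2 and Total_Sales come from the exercise. The rename step uses base R's names() rather than rename(), which is an assumption made so that the example needs no extra package.

```r
# Hypothetical sample data
demo_sales <- data.frame(
  age     = c(65, 72, 80),
  Sales_1 = c(100, 150, 200),
  Sales_2 = c(50, 25, 75)
)

# Create a new variable from two existing ones
demo_sales$Total_Sales <- demo_sales$Sales_1 + demo_sales$Sales_2

# Recode a numeric variable into categories
demo_sales$agecat <- ifelse(demo_sales$age > 70, "older", "younger")

# Rename a column using base R only
names(demo_sales)[names(demo_sales) == "age"] <- "Age"

print(demo_sales)
```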


Outliers and Missing Data treatment
Imputing missing data using standard methods and algorithmic approaches (mice package in R):
 In R, missing values are represented by the symbol NA (not available)
...
g
...

 Unlike SAS, R uses the same symbol for character and numeric data
...
na () function
...
T or True means that there is a missing
value
...
na(y)
# returns a vector (F F F T)

Arithmetic functions on missing values yield missing values
...
omit() function
...
omit(mydata)
Or, we can also use “na
...
From above example we use na
...

x <- c(1,2,NA,3)
mean(x, na
...
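
The behaviour of missing values described above can be verified with a short, self-contained sketch. It uses only base R functions (is.na, mean with na.rm, na.omit); the data frame demo_df and the other object names are hypothetical.

```r
x <- c(1, 2, NA, 3)

missing_flags <- is.na(x)                 # TRUE where a value is missing
plain_mean    <- mean(x)                  # NA, because arithmetic on NA yields NA
clean_mean    <- mean(x, na.rm = TRUE)    # NA is dropped before averaging

# na.omit() drops every row that contains at least one NA
demo_df  <- data.frame(a = c(1, NA, 3), b = c("p", "q", NA))
complete <- na.omit(demo_df)

print(missing_flags)
print(clean_mean)
print(complete)
```

Dropping rows is the simplest treatment; imputation methods such as those in the mice package keep the rows and estimate the missing values instead.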

PMM-> Predictive Mean Matching (PMM) is a semi-parametric imputation approach
...


Outliers:
An outlier is a point or an observation that deviates significantly from the other observations
...

Normally we use a box plot or a scatter plot to find outliers graphically
...
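
The points a box plot would draw as outliers can also be listed numerically. This is a minimal sketch on hypothetical data; it relies on base R's boxplot.stats(), which applies the usual 1.5 x IQR whisker rule by default.

```r
# Hypothetical data with one obviously extreme value
values <- c(10, 12, 11, 13, 12, 95)

# $out holds the points lying beyond the whiskers
flagged <- boxplot.stats(values)$out

print(flagged)
```

Calling boxplot(values) draws the corresponding plot, with the flagged point shown as an isolated marker.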
In most cases, you join
two data frames by one or more common key variables (i
...
, an inner join)
...
The two data frames must have
the same variables, but they do not have to be in the same order
...
Delete the extra variables in data frameA or
2
...

We use the cbind() function to combine data by column; the syntax is the same as for rbind()
...

We use rbind
...
It binds or combines a list of data frames filling missing columns
with NA
...
fill(mtcars[c("mpg", "wt")], mtcars[c("wt", "cyl")])
Here, all the missing values will be filled with NA
...
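
The three ways of combining data sets can be sketched with base R alone. The data frames below are hypothetical and exist only for illustration; merge(), rbind() and cbind() are standard base R functions.

```r
# Hypothetical data frames sharing the key column "id"
customers <- data.frame(id = c(1, 2, 3), name = c("Shaan", "Ritu", "Raj"))
orders    <- data.frame(id = c(2, 3, 4), amount = c(250, 400, 120))

# Inner join: keeps only the ids present in both data frames
joined <- merge(customers, orders, by = "id")

# rbind(): stack rows; the variables must match, but their order may differ
more_customers <- data.frame(name = "Asha", id = 5)
all_customers  <- rbind(customers, more_customers)

# cbind(): place columns side by side; the row counts must match
with_city <- cbind(customers, city = c("Pune", "Delhi", "Mumbai"))

print(joined)
print(all_customers)
print(with_city)
```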

We construct a “for” loop in R as follows:
for(i in values){

...

}
This for loop consists of the following parts:


The keyword for, followed by parentheses
...
In this example, we use i, but that can be any object name you
like
...




A vector with values to loop over
...




A code block between braces that has to be carried out for every value in the object values
...
Each time R loops through the code, R assigns the next value in
the vector with values to the identifier
...
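
The parts of the loop listed above fit together as in the following minimal sketch. The vector contents and the name running_total are chosen only for this example.

```r
values <- c(2, 4, 6)
running_total <- 0

for (i in values) {
  # On each pass, i takes the next value from the vector
  running_total <- running_total + i
  print(running_total)
}
```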

You could do this with two if statements, but there’s an easier way in R: an if
...

An if…else statement contains the same elements as an if statement, and then some extra:


The keyword else, placed after the first code block



A second block of code, contained within braces, that has to be carried out if and only if the result of the
condition in the if() statement is FALSE

For example,
if(hours > 100) net
...
price * 0
...
price<- net
...
06 } else
{tot
...
price * 1
...
price)}
Or it can also be written as:
if(public) tot
...
price * 1
...
price<- net
...
12
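
Because the listing above is incomplete in this extract, here is a separate, self-contained sketch of the same if…else pattern. The variable names and the two multipliers are illustrative assumptions, not a reconstruction of the original example.

```r
# Hypothetical inputs; the multipliers are illustrative only
net_price <- 1000
is_public <- TRUE

# Full form: one block for TRUE, one block for FALSE
if (is_public) {
  total_price <- net_price * 1.06
} else {
  total_price <- net_price * 1.12
}

# Compact form: if...else used as a single expression
total_price_short <- if (is_public) net_price * 1.06 else net_price * 1.12

print(total_price)
```

Both forms run exactly one of the two branches; the compact form is convenient when each branch is a single expression.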
Check Your Understanding

1
...
Linear modeling
b
...
Developing web applications
2
...
What is the difference between rbind and cbind?
4
...
Use the dataset named “IRIS”
...

It is freeware as well as open-source software
...

We use “<-” as an assignment operator
...

Rbind and Cbind are used to join 2 or more datasets
...

IFELSE is used to choose between 2 outcomes based on a condition
...


Divide the class into groups of 4-5 participants

2
...


3
...


Each group presents their steps (5 min each)

Module 1: Unit – 1
...

 How to operationalize the plan
...


 Facilitator Guide
 Student
Handbook

 Create awareness on the common
Service Level Agreements
Session 4

Course
Conclusion

 Validate learning objectives have
been met

n/a

 Make final summary remarks
 Conduct final Q&A
 Conclude the course


Facilitator Preparation
Responsibilities
 Review examples provided: reflect on your own experiences and determine when to share
them
...

 Make sure the learning resources are loaded on your computer
...
Conduct a dress rehearsal of the session as you move
through the content
...

 Note that all examples are in italics to emphasize key learning points; however, you may use
your own professional experience to enhance the learning
...



Principles of Facilitating
Personal Experiences
As a facilitator, you lead participants through prepared scenarios and discussions
...
Often, personal
experiences on how you helped a colleague through the career ownership process and guided
them to achieving work satisfaction are more memorable than step-by-step instructions on
following the career ownership process
...
Also, participants are more likely to remember answers if they
have to think and explore on their own
...


Experiential Learning
This workshop includes exercises designed to help participants discover the principles of
guiding the participants through the career ownership process and career satisfaction
...
Make
liberal use of the whiteboard to capture and display critical participant insights
...
For
example:

Rather than saying…

Ask…

The Reality Check worksheet provides valuable
information about how time is currently spent and
what it would look like in the best case scenario
...

Introductions
I am and I am your facilitator today
...


Give a brief of your own experience and background
...


“Regardless of why you’re here today, we’re all going to walk away with some key benefits – let’s discuss
those briefly
...

“To fulfill these objectives today, we’ll be conducting a number of hands-on activities
...
Your participation will be crucial to your
learning experience and that of your peers here in the session today
...


After participants give their views, debrief and bring to consensus

Question: Please share your thoughts on the following

A
...
Managing is the only option – Prioritize


Importance of Time Management

Provide a brief overview of the session
...

Open up the discussion for the session and ask participants to share their thoughts on
“time management”

The first part of this session discusses the following:
 “Plan better avoid wastage”
 Understanding the timelines of the deliverables
...

 It is important to value others’ time as well to ensure overall organizational timelines are met
 Share the perspective of how important time is, specifically in a global time zone mapping
scenario


Why Time Management?
Ask the participants this question and gather responses
...


Share the SSC model and how working across several time zones is important for the Shared Services
Center
...
Refer to the Aspects of Time Management table in the Student
Workbook and identify the rules that employees/workers must follow
...
Refer to the Vocabulary Words table if you do not understand the meaning
of a word/term
...


Time Management Aspects

Prompt participants to come up with some aspects and relate them back to here
...
urgency, gauge tasks in terms of



Impact of doing them



Effect of not doing them

Main aim of prioritization is to avoid a crisis
We must

Schedule our Priorities
as opposed to
Prioritizing our Schedule
Time Management quadrants

1
...
Not Urgent and Important – Schedule on your calendar
3
...


Not Urgent Not Important – Delegate, Automate or Decline


Check Your Understanding

1
...

a
...
False

Suggested Responses:
False – Time once lost cannot be gotten back – hence important to plan time utilization properly

2
...
True
b
...
True or False? Time management is required both at individual level and
organizational level
...
True
b
...
True or False? Activities should be judged on the basis of Urgency and Importance
c
...
False
Suggested Responses:

True – prioritization should be based on a 2x2 matrix of urgency and importance


Team Exercise
List the items and ask participants to classify them as per the quadrant
...
Discuss the rationale of their thoughts and categorization
...
Refer to the Time Management Quadrant on the display / in the Student
Workbook and categorize the below items
...
Refer to the Vocabulary Words table if you do not understand the
meaning of a word/term
...
Wildly important goal
2
...
Busy work
4
...
Pressing problems
6
...
Planning
8
...
Professional development
10
...
Too many objectives
12
...
Major Deadlines
14
...
Meaningless management reports
16
...
Low priority email
18
...
Workplace gossip
20
...
Needless interruptions
22
...
Aimless Internet surfing
24
...
Wildly important goal – Q1
2
...
Busy work – Q4 – Consumes time however not pressing
4
...
Pressing problems – Q1 – has to be solved immediately
6
...
Planning – Q2 – Important but not urgent; should be done before crisis
8
...
Professional development – Q2
10
...
Too many objectives – Q3 – Prioritize further to establish which are important and
pressing
12
...
Major Deadlines – Q1
14
...
Meaningless management reports – Q3 – Prioritize further to establish which are
important and pressing
16
...
Low priority email – Q3 – Prioritize further to establish which are important and
pressing
18
...
Workplace gossip – Q4 – Non value add; occasionally creates negativity
20
...
To be done in spare and
leisure time
...

21
...
Defining contribution – Q2
23
...
Irrelevant phone calls – Q4 – Reserve and avoid


Summary



It is important to manage time
...

Preparing morning tea is a good example
...
Perfect execution to ensure good morning tea
!!! with family
...

Start the session by connecting the course content to the candidate responses
...
Describe the jobs in terms of major outcomes and link to the organization’s need
The first step in expectation setting is to describe the job to the employees
...
We need to feel that our individual performance has an
impact on the organization’s mission
...
Share expectations in terms of work style
While setting expectations, it is important to talk not only about “what we do” but also about
“how we expect to do it”
...
Even if you have a solution, no one
likes surprises
...
If you see them doing
something poorly, tell them
...
Maximize Performance - Identify what is required to complete the work: Supervisor needs /
Employee needs
...
) but also the right levels of
direction (telling how to do the task) and support (engaging with employees about the task)
...
Establish priorities
...
Refer to earlier session
...
Revalidate understanding
...

6
...

Schedule an early progress check to get things started the right way, and agree on
scheduled/unscheduled further checks
...
True or False? Setting expectations is best done after the employee
has worked for 6 months
...
True
b
...
True or False? Do not provide too many details when setting expectations
...
True
b
...
True or False? Always check to make sure there is a common understanding of
expectations
...
True
b
...
True or False? Try not to ask too many questions while setting expectations
...
True
b
...
True or False? Employees need to know what tasks to do and how to communicate,
appreciating work styles
...
True
b
...
True or False? Employees do not need to know how their work contributes to
organizational results
...
True
b
...
True or False? Employees need to know what their team members’ performance
problems are
...
True
b
...


8
...

a
...
False
Suggested Responses:
False. They need to adapt and respond based on their partner’s work style; understanding work styles is critical to
enhancing the team’s operating performance
...
Discuss key points on importance of defining quality
output
...
Why
and How to be
defined

M
Measurable
The output
metrics and
yardsticks should
be defined
...


R
Realistic
Should be
challenging yet
attainable
...


[Figure: efficiency versus effectiveness. Axis: Use of Resources / Doing Things Right, from Inefficient to Efficient. Labelled quadrant: pursuing the wrong goals, but efficiently.]

Service Level Agreements
A Service Level Agreement (SLA) is a contract between a service provider and its internal or external
customers that documents what services the provider will furnish.
An SLA measures the service provider’s performance and quality in a number of ways
...

SLAs, once established, should be periodically reviewed and updated to reflect changes in technology and
the impact of any new regulatory directives


Summary


Every activity must have defined goals and objectives
...




One must balance the efficiency and effectiveness while performing the tasks to
achieve the desired objectives
...


Course Conclusion

“We’ve almost reached the end of the course! Before we wrap up, let’s review what
we’ve learned today”
Ask the participants to recall key learning points from the session and map these
learning points to the course objectives
...
1
Summarizing Data and Revisiting Probability
Topic: Summarizing Data, and Revisiting Probability

Session Goals: By the end of this session, you will be able to:

1
...
Work on Probability
...

P(A) = S/P
where S is the sample size, or number of positive outcomes, and P is the population size, or total number of outcomes
...


For example, the collection of all possible outcomes of a sequence of coin tossing is known to follow
the binomial distribution
...
Since the characteristics of these theoretical distributions are well
understood, they can be used to make statistical inferences on the entire data population as a whole
...

Now, “At Random” means that no card is given biased treatment, so the result is entirely
random
...
of Ace of Diamond in a pack = S = 1
Total no of possible outcomes = Total no
...
92% chance that we will get positive outcome
...
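The card example can be checked with a short R sketch. It applies P(A) = S/P to a 52-card deck built in code; the variable names (`deck`, `s`, `p`, `prob`) are illustrative and not part of the guide.

```r
# Classical probability: favourable outcomes divided by total outcomes.
# Build a standard 52-card deck (illustrative names, not from the guide).
ranks <- c("Ace", 2:10, "Jack", "Queen", "King")
suits <- c("Clubs", "Diamonds", "Hearts", "Spades")
deck <- paste(rep(ranks, times = length(suits)),
              rep(suits, each = length(ranks)),
              sep = " of ")

s <- sum(deck == "Ace of Diamonds")  # favourable outcomes (S)
p <- length(deck)                    # total possible outcomes (P)
prob <- s / p                        # P(A) = S / P

round(100 * prob, 2)                 # chance expressed as a percentage
```

With one favourable card out of 52, the probability is 1/52, a little under 2%.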


For example, the expected value of a die roll is 3.5
...


The expected value is also known as the expectation, mathematical expectation, EV, mean, or first
moment
...
In other words, each possible value the random variable can assume is multiplied by its
probability of occurring, and the resulting products are summed to produce the expected value
...
The formal definition subsumes both of these and also works for distributions
which are neither discrete nor continuous: the expected value of a random variable is the integral of the
random variable with respect to its probability measure
...
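For a discrete variable, the rule of multiplying each value by its probability and summing the products can be sketched in R as follows. The fair six-sided die is the assumed example, and the simulation at the end is only a sanity check on the exact figure.

```r
# Expected value of a fair six-sided die: sum of value * probability.
faces <- 1:6
probs <- rep(1 / 6, 6)          # assumption: every face is equally likely

ev <- sum(faces * probs)        # exact expected value
ev

# Sanity check by simulation: the mean of many rolls should be close to ev.
set.seed(42)
rolls <- sample(faces, size = 100000, replace = TRUE)
sim_mean <- mean(rolls)
sim_mean
```

Note that 3.5 is not a face of the die: the expected value is a long-run average, not necessarily an outcome that can occur.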


Random & Bivariate Random Variables
 Random Variable:


A random variable, aleatory variable or stochastic variable is a variable whose value is subject to
variations due to chance (i.e., randomness, in a mathematical sense)
...




A random variable is a real-valued function defined on the points of a sample space
...
The realizations of a random variable,
that is, the results of randomly choosing values according to the variable's probability distribution function,
are called random variates
...
But we are sure that we will either get a head or a tail
...
For example flip of coin
...
Discrete
2
...

Based on conditions, there are five major types of PDFs
...

On several occasions, we have observed its occurrence in graphs from apparently widely differing sources:
the sums when three or more dice are thrown; the binomial distribution for large values of n; and the
hypergeometric distribution
...

If

We say that X has a normal probability distribution
...
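As a reminder, stated here in standard textbook notation rather than as the guide's own, the density that defines a normal random variable X with mean μ and standard deviation σ > 0 is usually written as:

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}}
       \exp\!\left( -\frac{(x-\mu)^2}{2\sigma^2} \right),
\qquad -\infty < x < \infty
```

Setting μ = 0 and σ = 1 in this expression gives the standard normal distribution.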


Importance of Normal Distribution:
Normal distribution is a continuous distribution that is “bell-shaped”
...

Normal distributions can estimate probabilities over a continuous interval of data values
...


Standard Normal Distribution

The normal distribution f(x), with any mean μ and any positive deviation σ, has the following
properties:
 It is symmetric around the point x = μ, which is at the same time the mode, the median and the mean
of the distribution
...

 Its density has two inflection points (where the second derivative of f is zero and changes sign),
located one standard deviation away from the mean, namely at x = μ − σ and x = μ + σ
...

 Its density is infinitely differentiable, indeed super smooth of order 2
...

Test of Normal Distribution:
Normality tests are used to determine if a data set is well-modeled by a normal distribution and to compute
how likely it is for a random variable underlying the data set to be normally distributed
...


In frequentist statistical hypothesis testing, data are tested against the null hypothesis that they
are normally distributed
...

1
...
The empirical distribution of the data (the histogram) should be bell-shaped and resemble
the normal distribution
...
In this case one might proceed
by regressing the data against the quantiles of a normal distribution with the same mean and variance as the
sample
...
(see Anderson Darling
coefficient and Minitab)
A graphical tool for assessing normality is the normal probability plot, a quantile-quantile plot (QQ plot) of
the standardized data against the standard normal distribution
...
For normal data the points plotted in the QQ plot should fall approximately on a
straight line, indicating high positive correlation
...
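A minimal R sketch of the graphical check follows, using simulated data because no dataset is given at this point in the guide. It standardizes the sample, computes the QQ-plot coordinates with `qqnorm()`, and measures how close the points are to a straight line. The Shapiro–Wilk call is added for comparison with the frequentist tests covered next; the sample size and parameters are arbitrary assumptions.

```r
# Simulated sample (assumption: drawn from a normal distribution).
set.seed(1)
x <- rnorm(200, mean = 50, sd = 10)

# Standardize, then get the QQ-plot coordinates without drawing.
z <- (x - mean(x)) / sd(x)
qq <- qqnorm(z, plot.it = FALSE)   # qq$x: theoretical quantiles, qq$y: sample values

# For normal data the points lie near a straight line,
# so the correlation between the two sets of quantiles is close to 1.
qq_cor <- cor(qq$x, qq$y)
qq_cor

# A formal test for comparison: Shapiro-Wilk.
sw <- shapiro.test(x)
sw$p.value

# Draw the plot only in an interactive session.
if (interactive()) {
  qqnorm(z)
  qqline(z)
}
```

A small p-value from `shapiro.test()` would be evidence against normality; a large one only means the test found no evidence against it.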

Back-of-the-envelope test
A simple back-of-the-envelope test takes the sample maximum and minimum and computes their z-score, or
more properly t-statistic (number of sample standard deviations that a sample is above or below the sample
mean), and compares it to the 68–95–99.7 rule
...

This test is useful in cases where one faces kurtosis risk – where large deviations matter – and has the
benefits that it is very easy to compute and to communicate: non-statisticians can easily grasp that "6σ
events are very rare in normal distributions"
...
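The back-of-the-envelope check is easy to sketch in R. The helper `extreme_z()` below is a hypothetical name introduced for illustration; it returns how many sample standard deviations the minimum and maximum lie from the sample mean, which can then be compared with how rare such deviations are under normality.

```r
# Hypothetical helper: standardized distance of the sample extremes.
extreme_z <- function(x) {
  z <- (x - mean(x)) / sd(x)
  c(min = min(z), max = max(z))
}

# Simulated data (assumption: 1000 draws from a standard normal).
set.seed(2)
x <- rnorm(1000)
ez <- extreme_z(x)
ez

# Under normality, the chance that one observation lies more than
# 3 standard deviations from the mean (the "99.7" part of the rule):
p_beyond_3 <- 2 * pnorm(-3)
p_beyond_3

# Expected number of such observations in a sample of this size.
length(x) * p_beyond_3
```

If the extremes of a modest sample sit many standard deviations out, far beyond what these probabilities allow, the normal model is suspect.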
Frequentist tests:
Tests of univariate normality include D'Agostino's K-squared test, the Jarque–Bera test, the Anderson–
Darling test, the Cramér–von Mises criterion, the Lilliefors test for normality (itself an adaptation of the
Kolmogorov–Smirnov test), the Shapiro–Wilk test, the Pearson's chi-squared test, and the Shapiro–Francia
test
...

Some published works recommend the Jarque–Bera test
...
It has low power
for distributions with short tails, especially for bimodal distributions
...

Historically, the third and fourth standardized moments (skewness and kurtosis) were some of the earliest
tests for normality
...
Mardia's
multivariate skewness and kurtosis tests generalize the moment tests to the multivariate case
...

More recent tests of normality include the energy test (Székely and Rizzo) and the tests based on the
empirical characteristic function (ecf) (e.g. Epps and Pulley, Henze–Zirkler, BHEP test)
...

The normal distribution has the highest entropy of any distribution for a given standard deviation
...

3
...
However, the ratio of expectations of these posteriors and the expectation of the
ratios give similar results to the Shapiro–Wilk statistic except for very small samples, when noninformative priors are used
...
This approach has been extended by Farrell and Rogers-Stewart
...

More specifically, where X1, …, Xn are independent and identically distributed random variables with the
same arbitrary distribution, zero mean, and variance σ²; and Z is their mean scaled by $\sqrt{n}$,

$Z = \sqrt{n}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)$

Then, as n increases, the probability distribution of Z will tend to the normal distribution with zero mean
and variance σ²
...
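The limiting behaviour can be illustrated with a small simulation. This is a sketch under assumed settings: the X values are drawn from a Uniform(-0.5, 0.5) distribution (zero mean, variance 1/12), and the sample size and number of repetitions are arbitrary.

```r
# Central limit theorem by simulation.
set.seed(123)
n    <- 50       # observations per sample (assumed)
reps <- 5000     # number of repeated samples (assumed)
sigma2 <- 1 / 12 # variance of Uniform(-0.5, 0.5)

# Z = sqrt(n) * sample mean, computed once per repetition.
z <- replicate(reps, sqrt(n) * mean(runif(n, min = -0.5, max = 0.5)))

mean(z)   # should be close to 0
var(z)    # should be close to sigma2

# In an interactive session, compare the histogram with the normal curve.
if (interactive()) {
  hist(z, breaks = 40, freq = FALSE, main = "Scaled sample means")
  curve(dnorm(x, mean = 0, sd = sqrt(sigma2)), add = TRUE)
}
```

Although each X is uniform rather than bell-shaped, the scaled means cluster in a bell shape around zero.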


The Poisson distribution with parameter λ is approximately normal with mean λ and variance λ, for
large values of λ
...


The Student's t-distribution t(ν) is approximately normal with mean 0 and variance 1 when ν is
large
...
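Both approximations can be checked numerically in R. The values chosen below (λ = 100, a cut-off of 110, and 1000 degrees of freedom) are arbitrary illustrations, and the continuity correction of 0.5 is a common convention rather than something stated in the guide.

```r
# Poisson with large lambda versus the normal with the same mean and variance.
lambda <- 100
exact_p  <- ppois(110, lambda)                          # P(X <= 110), exact
approx_p <- pnorm(110.5, mean = lambda, sd = sqrt(lambda))  # normal approximation
c(exact = exact_p, normal = approx_p)

# Student's t with many degrees of freedom versus the standard normal.
t_q <- qt(0.975, df = 1000)   # 97.5th percentile of t(1000)
n_q <- qnorm(0.975)           # 97.5th percentile of N(0, 1)
c(t = t_q, normal = n_q)
```

In both comparisons the two numbers agree to about two decimal places, which is the sense in which the approximation holds for large λ or large ν.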

For example, the path traced by a molecule as it travels in a liquid or a gas, the search path of a foraging
animal, the price of a fluctuating stock and the financial status of a gambler can all be modeled as random
walks, although they may not be truly random in reality
...
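A simple one-dimensional random walk can be simulated in a few lines of R. In this sketch each step is +1 or -1 with equal probability, which is one common choice and an assumption here; the guide does not fix a step distribution.

```r
# Simple symmetric random walk: start at 0, move +1 or -1 at each step.
set.seed(7)
steps <- sample(c(-1, 1), size = 1000, replace = TRUE)
walk  <- c(0, cumsum(steps))   # position after each step

head(walk)
range(walk)

# Draw the path only in an interactive session.
if (interactive()) {
  plot(walk, type = "l", xlab = "Step", ylab = "Position",
       main = "Simulated random walk")
}
```

Replacing `steps` with, for example, daily price changes gives the kind of path used to model a fluctuating stock.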

Random walks have been used in many fields: ecology, economics, psychology, computer science, physics,
chemistry, and biology
...
The random walk theory corresponds to the belief that markets
are efficient, and that it is not possible to beat or predict the market because stock prices reflect all available
information and the occurrence of new information is seemingly random as well
Check your understanding

1
...
What is the normal distribution curve, and why is it called the
bell curve?
3
...
What are the various types of Probability Distribution curves?
5
...




A random walk is a mathematical formalization of a path that consists of a succession of random steps
...




A random variable, aleatory variable or stochastic variable is a variable whose value is subject to
variations due to chance

1
...
Give the Dataset to the participants
...
Give the class 10 minutes for each group to discuss the various
random variable examples, along with the types of
methods they would like to use
4
...
(5 min
each)

CASE STUDY – Binomial Dist
...
Answer the following questions:
• What is the distribution of the number of chocolates with gift coupons in seven days?
• What is the probability that Amir gets no chocolates with gift coupons in seven days?
• Amir gets no gift coupons for the first six days of the week
...
What is the probability that he gets at least three gift
coupons?
• How many days of purchase are required so that Amir’s chance of getting at least one gift
coupon is 0
...
of trials
r is the number of successful outcomes
p is the probability of success, and
q is the probability of failure
...
Distribution of number of chocolates with gift coupons in 7 days: $P(X = r) = \binom{7}{r} (1/6)^{r} (5/6)^{7-r}$
2
...
Probability of winning a coupon on the 7th day: 1/6
4
...
0005+0
...
0163)
= 0
...
Number of purchase days required so that probability of success is greater than 0
...
95 = 1 – P(X ≤0) ≥ 0
...
05
= n ≥ 16
...
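The case-study answers can be reproduced with R's binomial functions. This sketch assumes what the solution fragments above indicate: one purchase per day, a success probability of 1/6 per chocolate, and a target of at least 0.95 for the final question. The variable names are illustrative.

```r
# Binomial model for Amir's gift coupons.
p <- 1 / 6   # probability a chocolate carries a coupon (from the 1/6 in the solution)
n <- 7       # one purchase per day for seven days (assumed)

# Full distribution of the number of coupons in 7 days: P(X = r), r = 0..7.
coupon_dist <- dbinom(0:n, size = n, prob = p)
round(coupon_dist, 4)

# Probability of no coupons in seven days.
p_none <- dbinom(0, size = n, prob = p)
p_none

# Probability of at least three coupons in seven days.
p_at_least_3 <- 1 - pbinom(2, size = n, prob = p)
p_at_least_3

# Smallest number of days so that P(at least one coupon) >= 0.95,
# i.e. (5/6)^n <= 0.05 (0.95 is the assumed target).
days_needed <- ceiling(log(0.05) / log(1 - p))
days_needed
```

Because purchases are independent, the chance of a coupon on the seventh day stays at 1/6 regardless of the first six days. The logarithm bound is not a whole number, so the number of days is rounded up.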



Module 1 - Unit – 2
...

 Review all material – Facilitator Guide, Presentation, Guides and Handouts (if any)
 Make sure you have copies of all the handouts
...

 Conduct a run through of the content
...
Make sure you are comfortable with the tools and interactions
recommended in the facilitator guide
...

 Make sure you create folders for all breakout activities
...
During this
process, relate your own professional experience to add realism
...
Sharing experiences helps participants understand how professionals work and
think, and gives them the opportunity to apply those lessons to their own work processes
...

Your goal is to foster independent thinking and action rather than having participants depend on
your experience
...
Encourage a freewheeling discussion and call out important trends and insights
...


Socratic Questions
Your goal throughout the session is to guide participants towards thinking through the scenarios
and discussion questions independently, rather than providing answers
...


What information can you gather from
the Reality Check worksheet and how
can the information be used to move
towards career satisfaction?


Topic: Team Work
Working Effectively

Welcome the participants to the course and move to the introductions
...

Briefly review the roles of the Lead Facilitator and Support Facilitator, if any
...


Why are you here today? [Course Objectives]
“Why are you here today?”
After reviewing and arranging responses, summarize the responses and map the responses to the suggested
course benefits below
...


Debrief the following:
Why are teams more popular??


Teams outperform individuals



Teams use employee talent better



Teams are more flexible and responsive to environmental changes in the organization
...



Topic: Team Work
Team Work

Ask participants to share their thoughts on:



What is team work?



How is it more advantageous?

What is a Team?
A team comprises a group of people linked in a common purpose
...


Team

Work

Coming together is a beginning, keeping together is progress and working together is success
...
In a good team, members create an environment that allows everyone to go beyond
their limitations
...


Team Building


Trust
Communication
Planning
Decision Making
Problem Solving

Team work vs
...
The teams that are
integrated in Spirit, Enthusiasm, Cohesiveness and Camaraderie are vitally important
...

After participants give their views, debrief and bring to consensus

Team Work: Pros and Cons


Check Your Understanding

1
...
The organizations that display a higher level of team work are
generally more successful
...


Which one of the following is NOT a key attribute of Team Work?
a
...


Communication

c
...


Transparency

Suggested Answer:
c
...



Team work is essential to the success of every organization
...




Some of the fundamentals on which a team is built are: Collaboration,
Clear Expectations and Commitment
...
Discuss the importance of professional behavior in the
organization
...


Summarize
Who is professional?
A person who has achieved an acclaimed level of proficiency in any trade and whose competencies
can be measured against a fixed set of standards or guidelines
...

 Positively proactive
...

 Respect
...

 Opportunities to help others
...

 Follow-up
...
Professionals make it a
habit to follow-up on everything and accept responsibility when they fail to engage in that
behavior
...
Professionals know how to be empathetic
...

 Self-confident
...
These individuals have a high sense of balanced self-esteem and role
awareness
...
Professionals are truly sustainable in that they can continue forward when times
become difficult
...

 Integrity
...

 Optimize all interactions
...
They look to see how an interaction can benefit someone else even before
themselves
...
Being flexible and open to change allows these individuals to be quick on their feet
and nimble to the opportunities that they encounter on a daily basis
...
Having a high level of awareness of themselves, the marketplace, the community
and even the world helps these individuals continually stay on top of things
...
Last, but not least, professionals demonstrate exceptional leadership skills and,
even more importantly, self-leadership skills
...

What is professionalism?



Professionalism is the competence or set of skills that are expected from a professional
...


How long does it take for someone to form an opinion about you?
 Studies have proved that it just takes six seconds for a person to form an opinion about
another person
...
It says that you are
someone who can be trusted and hence can maintain contact with you
...
This shows that you are enthusiastic
...

Clothing – Appropriate clothing says that you are a leader with a winning potential
...
Neutrals are not restricted to grey, brown and off-white; you can also take advantage of
the beautiful navies, forest greens, burgundies, tans and caramel tones around
...

Things to remember



Wear neat clothes at work which are well ironed and do not stink
...




Women should avoid wearing revealing clothes at work
...


Check Your Understanding

1
...

a
...
False

Suggested Responses:
False

2
...
True
b
...
True or False? Well tailored Salwar Suit is not professional
...
True
b
...
Discuss the rationale of their
thoughts and categorization
...
Polo T Shirt –
2
...
Collared Shirt 4
...
Leather laced Shoes –
6
...
Backpacks –
8
...
Jeans on weekdays –
10
...
Matching Belt and Shoes –
12
...
Knee Length Skirt –
14
...
Obvious Tattoos –
Suggested Answers:
1
...

3
...

5
...

7
...

9
...

11
...

13
...

15
...




Empathy, Positive Attitude, Teamwork, Professional Language,
Knowledge, Punctuality, Confidence are some of the key characteristics
that determine the professionalism of a person
...



Key Points
Effective Communication

Provide a brief overview of the session
...


We would probably all agree that effective communication is essential to workplace effectiveness
...
The purpose of building communication skills is to
achieve greater understanding and meaning between people and to build a climate of trust,
openness, and support
...
And a big part of working well with other people is communicating effectively
...
So, let’s have
an experience that reminds us of the importance of effective communication
...


Activity Description:
Ask the participants to share an experience that reminds them of the significance of
effective communication OR consequences of ineffective communication
...

The question is: Are we communicating what we intend to communicate?
Does the message we send match the message the other person receives?
Impression = Expression
Real communication or understanding happens only when the receiver’s impression matches what
the sender intended through his or her expression
...

In simple terms, effective communication means this
...

I get it
...
Until a message is complete, the best we can say about its meaning is this:
The meaning of a message is not what is intended by the sender, but what is
understood by the receiver
...
It takes adding one
more step
...

In simple terms, complete or effective communication means
...

I get it
...

So far, then, we’ve defined effective communication and what makes it complete
...


Forms of Communication:


The most common way in which we communicate is by talking to the other person
...


Verbal communication

2
...
Written communication

Verbal Communication
Verbal communication refers to the use of sounds and language to relay a message
...
In combination with nonverbal forms
of communication, verbal communication acts as the primary tool for expression between two or more people
...
Whereas public speaking involves one or more people delivering a
message to a group, interpersonal communication generally refers to a two-way
exchange that involves both talking and listening
...
It encompasses everything from simple one-syllable sounds to complex discussions and relies on both language and emotion to
produce the desired effect
...


Non Verbal Communication
How do we communicate without words???

 We communicate a lot to each other outside what we say
...

(Intuitively, we generally view others’ “body language” as a more reliable indicator of their
attitudes and feelings than their words
...

- The key is discovering an individual’s behavior patterns—there is predictability to their
meaning
...


- Also, trying to read something into every movement others make can get in the way of
effective interactions
...
Ambulation is the way one walks
...

2
...
People
communicate trust, compassion, tenderness, warmth, and other feelings through touch
...
Some people are
“touchers” and others emit signals not to touch them
...
Eye contact is used to size up the trustworthiness of another
...

Speakers use eye contact to keep the audience interested
...
Posturing can constitute a set of potential signals that communicate how a person is
experiencing the environment
...
On the other hand, the person may just be
cold
...
Tics are involuntary nervous spasms that can be a key to indicate one is being threatened
...
But these mannerisms
can easily be misinterpreted
...
Sub-vocals are the non-words one says, such as “ugh” or “um
...
People use a lot of non-words trying to convey a message to
another person
...
" It is used in place of the "ugh"
and other grunts and groans commonly used
...
Distancing is a person’s psychological space
...
” People may try to move back to reestablish their
personal space
...

8
...
This is especially true between cultures
...

9
...
For example, the message, “I trust you,” can have many meanings
...
“I trust you” could imply strong sincerity
...

Written Communication
Written communication involves any type of message that makes use of
the written word
...


Examples of written communications generally used with clients or other
businesses include email, Internet websites, letters, proposals, telegrams,
faxes, postcards, contracts, advertisements, brochures, and news releases
...

- It is a permanent means of communication
...

- Written communication is more precise and explicit
- Effective written communication develops and enhances organization’s image
- It provides ready records and references
...
It is costly in terms of stationery
and the manpower employed in writing/typing and delivering letters
...

- Written communication is time-consuming as the feedback is not immediate
...

- Effective written communication requires great skills and competencies in language and
vocabulary use
...


- It involves too much paperwork and e-mail burden

Common Etiquettes In Written Communication
Continuing with the series of etiquettes in communication, language expert Preeti Shirodkar tells us about what we
need to keep in mind while communicating in writing
...

1 – Structuring of the Content
Introduction, Body and Conclusion: While writing one should ensure that the content is well organized, with the
overview/basic details comprising the introduction; all major points with their explanation and exemplification
constituting the body (preferably divided into a separate paragraph each for every new point, with titles and subtitles, if
necessary)
...

Moreover, care should be taken to ensure that the flow is not brought about through a forced/deliberate use of
connectives, as this makes the piece extremely uninteresting and artificial
...

Moreover, short forms can at times be culture-specific or even organization-specific and may thus unnecessarily
complicate the communication
...
So too, spelling errors can create the same effect or can even reflect a careless attitude on the part of the sender
...


5 – Sensitivity to the Audience
One needs to be aware of and sensitive to the emotions, needs and nature of the audience in choosing the vocabulary,
content, illustrations, formats and medium of communication, as discomfort in the audience would hamper rather than
facilitate communication
...

This is especially true in the case of all detailed writing that seeks to hold the readers' attention
...


Some Do’s and Don’ts of Writing



Be Specific: Just like a reporter, communicate the “who, what, where, why, when and how”
of what needs to be done
...




Avoid the Passive Voice: Instead of writing “The program was planned by Dane,” write,
“Dane planned the program
...
Get to the point
...
And
also make sure that you do a careful proof of your work
...
However, if you’re
writing for a formal audience, like a proposal to the board of directors, be more formal with
your language
...
This
will help you determine if you’ve used incorrect words, if your sentences run on too long, if
your tenses don’t match, and more
...


Common barriers to effective Communication:


1
...
Over-complicated, unfamiliar and/or technical terms
...
Emotional barriers and taboos
...

3
...

4
...

5
...

6
...

Not being able to see the non-verbal cues, gestures,
posture and general body language can make communication less effective
...

7
...

8
...
People often hear what they expect to hear rather than what is actually said and
jump to incorrect conclusions
...
Cultural differences
...
For example, the concept of personal space varies
between cultures and between different social settings
...
True or False? A good definition of communication is the sending of
information from
one person to another
...
True
b
...
True or False? Good working relationships between people form an important
foundation for effective communication
a
...
False


Suggested Responses:

True

3
...

a
...
False
Suggested Responses:

True

4
...
True
b
...
True or False? A person’s attitude toward the value of communication is
more important than the skills or methods used to communicate
...
True
b
...
True or False? Everyone should be responsible for effective upward,
downward, and
horizontal communication
...
True
b
...
True or False? A sender has failed to communicate unless the receiver
understands the
message the way the sender intended it
...
True
b
...
True or False? The grapevine is usually an accurate source of information,
and should
be used intentionally to communicate
...
True
b
...
True or False? If people don’t understand, they will usually indicate so by
asking
questions or by saying they don’t understand
...
True
b
...
True or False? In order to have an effective communication program, top
management
must take an active part
a
...
False
Suggested Responses:

True

11
...

a
...
False
Suggested Responses:

True

12
...
True
b
...
True or False? The best way to be sure we understand a communication is to
repeat it
back to the communicator
...
True
b
...
True or False? The use of effective visual aids by a speaker usually provides
a
significant increase in the audience’s understanding of the message
a
...
False

Suggested Responses:

True
15
...

a
...
False
Suggested Responses:

True

16
...

a
...
False

Suggested Responses:

True

17
...

a
...
False
Suggested Responses:

True


18
...

a
...
False
Suggested Responses:

True
19
...

a
...
False
Suggested Responses:

True
Summary



The purpose of effective communication skills is to achieve greater
understanding between people that builds a climate of trust, openness,
and support
...


 Lack of attention and interest, use of jargon and language differences
are some of the common communication barriers
...



Module 1 : Unit – 3
SQL using R
Session Time: 180 minutes
Topic

Activities

SQL using R

By the end of this session, you will be able
to:
1
...
Work on Excel and R integration
...

SQL using R:
sqldf is an R package for running SQL statements on data frames
...
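A minimal sketch of running SQL against a data frame is shown below. The `emp` data frame and its columns are hypothetical, made up for illustration. The sqldf call runs only if the package is installed; a base R `aggregate()` call that produces the same grouping is included so the example works either way.

```r
# Hypothetical employee data (not from the guide).
emp <- data.frame(
  name   = c("Asha", "Ravi", "Meena", "John"),
  dept   = c("Analytics", "Analytics", "Sales", "Sales"),
  salary = c(52000, 48000, 45000, 50000),
  stringsAsFactors = FALSE
)

# Base R equivalent of:
#   SELECT dept, AVG(salary) AS avg_salary FROM emp GROUP BY dept
base_result <- aggregate(salary ~ dept, data = emp, FUN = mean)
names(base_result)[2] <- "avg_salary"
base_result

# The same query through sqldf, if the package is available.
if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)
  sql_result <- sqldf(
    "SELECT dept, AVG(salary) AS avg_salary
       FROM emp
      GROUP BY dept
      ORDER BY dept"
  )
  print(sql_result)
}
```

The data frame name is used directly as the table name inside the SQL string, which is what makes sqldf convenient for people who already know SQL.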

NoSQL databases are increasingly used in big data and real-time web applications
...

There have been various approaches to classify NoSQL databases, each with different categories
and subcategories, some of which overlap
...
SQL Summary

Types
SQL Databases: One type (SQL database) with minor variations
NoSQL Databases: Many different types including key-value stores, document databases, wide-column stores, and graph databases

Development History
SQL Databases: Developed in 1970s to deal with first wave of data storage applications
NoSQL Databases: Developed in 2000s to deal with limitations of SQL databases, particularly concerning scale, replication and unstructured data storage

Examples
SQL Databases: MySQL, Postgres, Oracle Database
NoSQL Databases: MongoDB, Cassandra, HBase, Neo4j

Individual records (e
...
, "employees") are

Varies based on database type
...
g
...
), much like a

with more complex information sometimes stored

Data Storage

spreadsheet
...
Document databases

Model

separate tables, and then joined together when

do away with the table-and-row model altogether,

more complex queries are executed
...
When a user wants

which can nest values hierarchically
...

Structure and data types are fixed in advance
...
Records can add new
information on the fly, and unlike SQL table rows,

entire database must be altered, during which

dissimilar data can be stored together as necessary
...


For some databases (e
...
, wide-column stores), it is
somewhat more challenging to add new fields
dynamically
...
It is possible to spread

commodity servers or cloud instances
...


required
...
g
...
g
...
g
...
database level)

Specific language using Select, Insert, and

Through object-oriented APIs

Update statements, e
...
SELECT fields FROM
table WHERE…
Can be configured for strong consistency

Consistency

Open-source

Depends on product
...
g
...
g
...

require(gdata)
myDf<- read
...
xlsx"), sheet = 1, header = TRUE)
RODBC: This is reported for completeness only
...

XLConnect: It might be slow for large datasets but is very powerful otherwise
...
xlsx")
myDf<- readWorksheet(wb, sheet = "Sheet1", header = TRUE)
xlsx: Prefer the read
...
xlsx(), it’s significantly faster for large dataset
...
xlsx2("myfile
...
It’s rather fast but doesn’t support
...
It has been removed from CRAN lately
...
table("clipboard"): It allows you to copy data from Excel and read it directly into R
...

myDf<- read
...
frame then read this file in Excel
...
table
...
csv which uses “
...
csv2 which uses a comma for the decimal point and a semicolon for
the separator
...
table(x,"your_path",sep=",",row
...
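The export route can be sketched as a round trip through a CSV file that Excel can open. The truncated call above appears to be `write.table()`; the `row.names = FALSE` argument, the temporary file path and the sample data frame are assumptions made for this illustration.

```r
# Sample data frame (hypothetical).
x <- data.frame(id = 1:3, score = c(9.5, 7.25, 8))

# Write a comma-separated file without the row-name column.
path <- tempfile(fileext = ".csv")   # stand-in for "your_path"
write.table(x, path, sep = ",", row.names = FALSE)

# Inspect the raw text that Excel would see.
readLines(path)

# Read it back to confirm nothing was lost.
back <- read.csv(path)
all.equal(x, back)

unlink(path)   # clean up the temporary file
```

In locales where Excel expects a semicolon separator and a decimal comma, `write.csv2()` is the matching base R function.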
You can run a batch
file within the VBA code
...
exe is in your PATH, the general syntax for the batch file (
...
R
Here’s an example of how to integrate the batch file above within your VBA code
...
Generally speaking, once you have installed RExcel, you insert the
Excel code within a cell and execute it from the RExcel spreadsheet menu
...

5 – Execute VBA code in R
This is something I came across but I never tested it myself
...
First write a
VBscript wrapper that calls the VBA code
...

The method is described in full detail here
...
It allows communication in both directions: Excel to R and R to
Excel and covers most of what is described above and more
...

There is a wiki for installing RExcel and an excellent tutorial available here
...
They both give an in-depth view of RExcel capabilities
...

2
...

4
...

 A NoSQL (originally referring to "non SQL" or "non-relational") database provides a
mechanism for storage and retrieval of data that is modeled in means other than the tabular
relations used in relational databases
...

 NoSQL is used primarily as a complement to Big Data tools
...

 How to execute VBA code in R tool?

1
...
Give the Dataset to the participants
...
Give the class 10 minutes for each group to discuss the various differences
between SQL and NoSQL, along with the types of methods
they would like to use
4
...
(5 min each)


Module 1 UNIT – 4
Correlation and Regression
Topic

Session Goals

Correlation and Regression

By the end of this session, you will be able
to:
1
...
Find Correlation
3
...
Work on Multiple Regression
5
...
e
...

In simple regression, we try to determine whether there is a relationship between two variables
...


In R, we use the lm() function to do simple regression modeling
...
Fitted, Normal Q-Q, Scale-Location, and
Residuals vs
...

Below are the various graphs representing values of regression


OLS Regression
OLS: Ordinary least squares (OLS), or linear least squares, is a method for estimating the unknown
parameters in a linear regression model, with the goal of minimizing the sum of squared differences between the
observed responses in a dataset and the responses predicted by the linear approximation of
the data
...
e
...


Regression Modeling

 Regression modeling or analysis is a statistical process for estimating the relationships among
variables
...
More specifically, regression analysis helps one understand how the
typical value of the dependent variable (or 'criterion variable') changes when any one of the
independent variables is varied, while the other independent variables are held fixed
...
Less commonly, the focus is on a quantile, or other location
parameter of the conditional distribution of the dependent variable given the independent
variables
...
In regression analysis, it is also of interest to characterize the variation of
the dependent variable around the regression function which can be described by a probability
distribution
...
Regression analysis is also used to understand which
among the independent variables are related to the dependent variable, and to explore the forms
of these relationships
...
However this can lead to
illusions or false relationships, so caution is advisable; for example, correlation does not imply
causation
...
Familiar methods
such as linear regression and ordinary least squares regression are parametric, in that the
regression function is defined in terms of a finite number of unknown parameters that are
estimated from the data
...

 The performance of regression analysis methods in practice depends on the form of the data
generating process, and how it relates to the regression approach being used
...
These assumptions are sometimes
testable if a sufficient quantity of data is available
...
However, in many applications, especially with small effects or questions of
causality based on observational data, regression methods can give misleading results
...
The case of a
continuous output variable may be more specifically referred to as metric regression to
distinguish it from related problems
...

Because a linear regression model is not always appropriate for the data, you should assess the
appropriateness of the model by defining residuals and examining residual plots
...
Each data point has one residual
...
That is, Σ e = 0 and ē = 0
...
If the points in a residual plot are randomly dispersed around the horizontal axis, a
linear regression model is appropriate for the data; otherwise, a non-linear model is more appropriate
...


x     60       70       80       85       95
y     70       65       70       95       85
ŷ     65.411   71.849   78.288   81.507   87.945
e     4.589    -6.849   -8.288   13.493   -2.945

The residual plot shows a fairly random pattern - the first residual is positive, the next two are negative,
the fourth is positive, and the last residual is negative
...
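The fitted values and residuals for the five (x, y) pairs above can be reproduced with a short script. The guide itself uses R's lm() for this; the sketch below is plain Python (standard library only), written just to make the least-squares arithmetic explicit. The helper name fit_line is an illustrative assumption, not part of the course material.

```python
from statistics import mean


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Data from the residual example in this unit.
x = [60, 70, 80, 85, 95]
y = [70, 65, 70, 95, 85]

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]
# Residual = observed value minus fitted value.
residuals = [yi - fi for yi, fi in zip(y, fitted)]

for xi, yi, fi, ei in zip(x, y, fitted, residuals):
    print(f"x={xi:3d}  y={yi:3d}  fitted={fi:7.3f}  residual={ei:7.3f}")
print("sum of residuals (zero up to rounding):", sum(residuals))
```

Plotting the residuals against x then gives the residual plot discussed in the text.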

Below, the residual plots show three typical patterns
...
The other plot patterns are non-random (U-shaped and inverted U),
suggesting a better fit for a non-linear model
...
•0 to 0.2 : No or very weak association
•0.2 to 0.4 : Weak association
•0.4 to 0.6 : Moderate association
•0.6 to 0.8 : Strong association
•0
...

Correlation is defined in terms of the variance of x, the variance of y, and the covariance of x and y (the
way the two vary together; the way they co-vary) on the assumption that both variables are normally
distributed
...

We denote the covariance of x and y by cov(x, y), after which the correlation coefficient r is defined as
r = cov(x, y) / sqrt(var(x) × var(y))
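As a minimal sketch, the correlation coefficient can be computed directly as the covariance of x and y divided by the square root of the product of their variances. The code below is plain Python (the guide itself works in R) and reuses the five (x, y) pairs from the residual example; the function name correlation is an illustrative assumption.

```python
from math import sqrt
from statistics import mean


def correlation(x, y):
    """Pearson correlation: cov(x, y) / sqrt(var(x) * var(y))."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    var_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    var_y = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
    return cov_xy / sqrt(var_x * var_y)


x = [60, 70, 80, 85, 95]
y = [70, 65, 70, 95, 85]
r = correlation(x, y)
print(f"r = {r:.3f}")
```

The (n - 1) divisors cancel in the ratio, so the result is the same whether sample or population variances are used.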


Heteroscedasticity:
A collection of random variables is heteroscedastic (or 'heteroskedastic' from Ancient Greek hetero
“different” and skedasis “dispersion”) if there are sub-populations that have different variabilities from
others
...
Thus heteroscedasticity is the absence of homoscedasticity
...
For instance, while the ordinary least squares estimator is still unbiased in the
presence of heteroscedasticity, it is inefficient because the true variance and covariance are
underestimated
...

Tests for Heteroscedasticity:

Tests in regression
 Levene's test
 Goldfeld–Quandt test
 Park test
 Glejser test
 Brown–Forsythe test
 Harrison–McCabe test
 Breusch–Pagan test
 White test
 Cook–Weisberg test

Tests for grouped data
 F-test of equality of variances
 Cochran's C test
 Hartley's test

These tests consist of a test statistic (a mathematical expression yielding a numerical value as a
function of the data), a hypothesis that is going to be tested (the null hypothesis), an alternative
hypothesis, and a statement about the distribution of the statistic under the null hypothesis
...
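To make the "test statistic plus null distribution" idea concrete, here is a hedged sketch of one test from the list, the Breusch–Pagan test, in its studentized n·R² form for a single predictor. It is plain Python rather than the guide's R, the data are hypothetical, and the helper names are illustrative assumptions. The squared OLS residuals are regressed on x; under the null hypothesis of constant variance, n·R² from that auxiliary regression is approximately chi-square with 1 degree of freedom.

```python
from math import erfc, sqrt
from statistics import mean


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(x, y):
    """Coefficient of determination of the simple regression of y on x."""
    intercept, slope = fit_line(x, y)
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def breusch_pagan(x, y):
    """Return (LM statistic, p-value) for one predictor (1 degree of freedom)."""
    intercept, slope = fit_line(x, y)
    squared_resid = [(yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y)]
    # Guard against a tiny negative R^2 caused by rounding.
    lm = max(0.0, len(x) * r_squared(x, squared_resid))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(lm / 2))
    return lm, p_value


# Hypothetical data: the same trend with constant vs. growing error size.
x = list(range(1, 21))
signs = [1 if i % 2 == 0 else -1 for i in range(20)]
y_constant = [2 + 3 * xi + s for xi, s in zip(x, signs)]
y_growing = [2 + 3 * xi + 0.5 * xi * s for xi, s in zip(x, signs)]

lm_constant, p_constant = breusch_pagan(x, y_constant)
lm_growing, p_growing = breusch_pagan(x, y_growing)
print(f"constant spread: LM={lm_constant:.3f}, p={p_constant:.3f}")
print(f"growing spread:  LM={lm_growing:.3f}, p={p_growing:.5f}")
```

A small p-value for the second series leads to rejecting the null hypothesis of homoscedasticity.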
They are:
 View logarithmized data
...
The variability in percentage terms
may, however, be rather stable
...

 Apply a weighted least squares estimation method, in which OLS is applied to transformed or
weighted values of X and Y
...
In one variation the weights are directly related to the magnitude of
the dependent variable, and this corresponds to least squares percentage regression
...
HCSE is a consistent estimator of standard errors in regression models with
heteroscedasticity
...
This method may be superior to regular OLS because, if heteroscedasticity is
present, it corrects for it; if the data is homoscedastic, the standard errors are
equivalent to conventional standard errors estimated by OLS
...
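As an illustration of heteroscedasticity-consistent standard errors, the sketch below compares the conventional OLS standard error of the slope with White's HC0 version for a simple regression. It is plain Python rather than the guide's R, the function name slope_standard_errors is an illustrative assumption, and the tiny dataset is hypothetical. HC0 is only one of several HCSE variants.

```python
from math import sqrt
from statistics import mean


def slope_standard_errors(x, y):
    """Return (slope, conventional SE, HC0 robust SE) for y = a + b*x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    # Conventional OLS: assumes one common error variance s^2.
    s2 = sum(e ** 2 for e in resid) / (n - 2)
    se_ols = sqrt(s2 / sxx)

    # White (HC0): each observation contributes its own squared residual.
    weighted = sum((xi - x_bar) ** 2 * e ** 2 for xi, e in zip(x, resid))
    se_hc0 = sqrt(weighted) / sxx
    return slope, se_ols, se_hc0


# Hypothetical three-point example, small enough to check by hand.
slope, se_ols, se_hc0 = slope_standard_errors([0, 1, 2], [0, 1, 4])
print(f"slope={slope:.3f}  OLS SE={se_ols:.4f}  HC0 SE={se_hc0:.4f}")
```

The slope estimate is unchanged; only the estimated uncertainty around it differs between the two formulas.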



Autocorrelation

Autocorrelation, also known as serial correlation or cross-autocorrelation, is the cross-correlation of a
signal with itself at different points in time (that is what the cross stands for)
...
It is a mathematical tool for
finding repeating patterns, such as the presence of a periodic signal obscured by noise, or identifying
the missing fundamental frequency in a signal implied by its harmonic frequencies
...

In statistics, the autocorrelation of a random process describes the correlation between values of the
process at different times, as a function of the two times or of the time lag
...
(i may be an integer for a discrete-time process or a real number for a continuous-time process
...
Suppose that the process is further known to have
defined values for mean μ_i and variance σ_i² for all times i
...

Test:
 The traditional test for the presence of first-order autocorrelation is the Durbin–Watson statistic
or, if the explanatory variables include a lagged dependent variable, Durbin's h statistic
...
 A more flexible test, covering autocorrelation of higher orders and applicable whether or not the
regressors include lags of the dependent variable, is the Breusch–Godfrey test
...
The simplest version of the test statistic from this auxiliary regression is TR², where T
is the sample size and R² is the coefficient of determination
...
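The Durbin–Watson statistic mentioned above is simple enough to compute by hand. The sketch below is plain Python (not the guide's R) and uses two hypothetical residual series; the function names are illustrative assumptions. The statistic is the sum of squared successive differences of the residuals divided by their sum of squares: values near 2 suggest no first-order autocorrelation, values near 0 positive autocorrelation, and values near 4 negative autocorrelation.

```python
def durbin_watson(resid):
    """Sum of squared successive differences over the sum of squares."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den


def lag1_autocorrelation(resid):
    """Sample autocorrelation of the series at lag 1."""
    m = sum(resid) / len(resid)
    num = sum((resid[t] - m) * (resid[t - 1] - m) for t in range(1, len(resid)))
    den = sum((e - m) ** 2 for e in resid)
    return num / den


# Hypothetical residual series.
smooth = [1.0, 1.2, 0.9, 0.5, -0.2, -0.8, -1.1, -0.9, -0.4, 0.3, 0.5]
alternating = [1.0, -1.0] * 5

for name, series in [("smooth", smooth), ("alternating", alternating)]:
    dw = durbin_watson(series)
    r1 = lag1_autocorrelation(series)
    print(f"{name:12s} DW={dw:.3f}  lag-1 autocorrelation={r1:.3f}")
```

The slowly drifting series gives a statistic well below 2, while the sign-flipping series gives one well above 2.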



Multicollinearity
In statistics, multicollinearity (also collinearity) is a phenomenon in which two or more predictor
variables in a multiple regression model are highly correlated, meaning that one can be linearly
predicted from the others with a substantial degree of accuracy
...
Multicollinearity does not reduce the predictive power or reliability of the model as a
whole, at least within the sample data set; it only affects calculations regarding individual predictors
...

In case of perfect multicollinearity the predictor matrix is singular and therefore cannot be inverted
...

Test: Indicators that multicollinearity may be present in a model:
1) Large changes in the estimated regression coefficients when a predictor variable is added or
deleted
2) Insignificant regression coefficients for the affected variables in the multiple regression, but a
rejection of the joint hypothesis that those coefficients are all zero (using an F-test)
3) If a multivariable regression finds an insignificant coefficient of a particular explanator, yet a
simple linear regression of the explained variable on this explanatory variable shows its coefficient to
be significantly different from zero, this situation indicates multicollinearity in the multivariable
regression
...
A tolerance of less than 0
...
10 and/or a VIF of 5 or 10 and above indicates a
multicollinearity problem
...
It will indicate that the inversion of the matrix is numerically unstable with finite-precision
numbers (standard computer floats and doubles)
...
The Condition Number is computed by
finding the square root of (the maximum eigenvalue divided by the minimum eigenvalue)
...
One advantage of this method is that it also shows which variables
are causing the problem
...
C
...
The Farrar–Glauber test has also been criticized by
other researchers
...
Multicollinearity can be detected by adding random noise to the data and
re-running the regression many times and seeing how much the coefficients change
...

Correlation values (off-diagonal elements) of at least
...
This procedure is, however, highly problematic and cannot be
recommended
...
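For the special case of two predictors, the tolerance, VIF and condition number diagnostics listed above reduce to simple functions of the correlation r between the predictors. The sketch below is plain Python (not the guide's R) with hypothetical data and illustrative function names. It assumes the condition number is taken from the 2×2 correlation matrix of the predictors, whose eigenvalues are 1 + |r| and 1 - |r|; other conventions work from the design matrix directly.

```python
from math import sqrt
from statistics import mean


def correlation(x, y):
    """Pearson correlation between two equally long sequences."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def collinearity_diagnostics(x1, x2):
    """Return (r, tolerance, VIF, condition number) for two predictors.

    Fails with a division by zero under perfect collinearity (|r| = 1).
    """
    r = correlation(x1, x2)
    tolerance = 1 - r ** 2  # 1 - R^2 from regressing one predictor on the other
    vif = 1 / tolerance
    condition_number = sqrt((1 + abs(r)) / (1 - abs(r)))
    return r, tolerance, vif, condition_number


# Hypothetical predictors: x2 is almost exactly 2 * x1, x3 is unrelated.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
x3 = [3, 1, 4, 1, 5, 2]

for label, other in [("x1 vs x2", x2), ("x1 vs x3", x3)]:
    r, tol, vif, cond = collinearity_diagnostics(x1, other)
    print(f"{label}: r={r:.4f} tolerance={tol:.4f} VIF={vif:.1f} cond={cond:.1f}")
```

The nearly collinear pair produces a very small tolerance and a large VIF, while the unrelated pair stays close to 1 on both.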

Introduction to Multiple Regression

The general purpose of multiple regression (the term was first used by Pearson, 1908) is to learn more
about the relationship between several independent or predictor variables and a dependent or criterion
variable
...
Once this information has been compiled for various houses it
would be interesting to see whether and how these measures relate to the price for which a house is
sold
...
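A minimal sketch of the house-price idea: multiple regression with two predictors, fitted by solving the normal equations (XᵀX)b = Xᵀy. This is plain Python rather than R (where lm() handles it directly), and the house data, variable names and helper functions are all hypothetical. The prices were constructed as 20 + 10·size + 15·bedrooms so that the recovered coefficients are easy to check.

```python
def solve_linear_system(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def multiple_regression(predictors, y):
    """Return [intercept, b1, b2, ...] from the normal equations."""
    rows = [[1.0] + list(p) for p in predictors]  # leading 1 for the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical houses: (size in hundreds of square feet, bedrooms).
houses = [(10, 2), (15, 3), (12, 3), (20, 4), (18, 3), (24, 5)]
# Hypothetical prices in thousands.
prices = [150, 215, 185, 280, 245, 335]

intercept, b_size, b_bedrooms = multiple_regression(houses, prices)
print(f"price ≈ {intercept:.2f} + {b_size:.2f}*size + {b_bedrooms:.2f}*bedrooms")
```

Each coefficient is read as the change in price for a one-unit change in that predictor while the other predictor is held fixed.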



Dummy Variables

In regression analysis, a dummy variable (also known as an indicator variable, design variable, Boolean
indicator, categorical variable, binary variable, or qualitative variable) is one that takes the value 0 or 1
to indicate the absence or presence of some categorical effect that may be expected to shift the
outcome
...

In other words, dummy variables are "proxy" variables or numeric stand-ins for qualitative facts in a
regression model
...
), but also by qualitative variables (gender, religion,
geographic region, etc
...

For example, suppose gender is one of the qualitative variables relevant to a regression
...
If female is arbitrarily assigned the value
of 1, then male would get the value 0
...
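The gender coding above can be sketched in a few lines. The example is plain Python (not the guide's R), and the outcome values and names are hypothetical. With a single 0/1 dummy as the only predictor, the intercept equals the mean outcome of the group coded 0 and the slope equals the difference between the two group means.

```python
from statistics import mean


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical observations.
gender = ["female", "male", "female", "male", "male", "female"]
score = [52, 48, 55, 47, 50, 58]

# Dummy coding: female is arbitrarily assigned 1, so male gets 0.
female = [1 if g == "female" else 0 for g in gender]

intercept, slope = fit_line(female, score)
print(f"mean score when female = 0 (male): {intercept:.3f}")
print(f"shift in mean score when female = 1: {slope:.3f}")
```

Swapping the coding (male = 1) would flip the sign of the slope and move the intercept to the other group's mean, without changing the fitted values.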



Check your understanding

In the context of regression analysis, which of the following statements are
true?
I
...

II
...

III
...

(A) I only
(B) II only
(C) III only
(D) I and II
(E) I and III

Summary

 Regression is a method of establishing a relationship between two or more variables
...
 The correlation coefficient lies between
...
 Ordinary least squares (OLS) or linear least
...
 Multiple regression finds the relationship between several independent or predictor variables
and a dependent or criterion variable
...
 A dummy variable is one that takes the value 0 or 1 to indicate the absence or presence of
some categorical effect
...

Solution

The correct answer
...
A random pattern of residuals supports a linear model; a non-random pattern supports a non-linear model
...
the data set is linear or nonlinear
...
Solve Engg
...
Issues
2
...
Engineering
Design, Manufacturing, smart utilities,
production lines, Automotive industries,
Tech system

Classroom

Understand the business problem related to
engineering; identify the critical issues
...


Classroom

Requirement gathering

Classroom

Summary

Classroom


Step-by-Step

Understand systems viz
...
The process is highly iterative - parts of the process often need
to be repeated many times before production phase can be entered - though the part(s) that get
iterated and the number of such cycles in any given project can be highly variable
...

Manufacturing:
Manufacturing is the production of merchandise for use or sale using labour and machines, tools,
chemical and biological processing, or formulation
...
Such finished goods may
be used for manufacturing other, more complex products, such as aircraft, household appliances or
automobiles, or sold to wholesalers, who in turn sell them to retailers, who then sell them to end
users – the "consumers"
...
In a free market economy,
manufacturing is usually directed toward the mass production of products for sale to consumers at
a profit
...
In mixed market economies, manufacturing occurs under some
degree of government regulation
...
Some industries, such as semiconductor and steel
manufacturers use the term fabrication instead
...
Examples
of major manufacturers in North America include General Motors Corporation, General Electric,
Procter & Gamble, General Dynamics, Boeing, Pfizer, and Precision Castparts
...
Examples in Asia include Sony,
Huawei, Lenovo, Toyota, Samsung, and Bridgestone
...
S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology; often written as SMART) is a
monitoring system included in computer hard disk drives (HDDs) and solid-state drives (SSDs) that
detects and reports on various indicators of drive reliability, with the intent of enabling the anticipation
of hardware failures
...
When S.M.A.R.T. data indicates a possible imminent drive failure, software running on the host
system may notify the user so stored data can be copied to another storage device, preventing data
loss, and the failing drive can be replaced
...
Set business
objectives
...
In
the process, organizations may also determine strategies to guide operations and help achieve
competitive advantages
...
The latter, identifying opportunities can be viewed as a
problem of strategy choice requiring a solution
...
They
include the following:
 Review business plans, existing models and other documentation
 Interview subject area experts
 Conduct fact-finding meetings
 Analyze application systems, forms, artifacts, reports, etc
...
Large meetings are not a good use of time for data gathering
...
They are also useful to prioritize final business requirements
...


Primary or local data is collected by the business owner and can be collected by survey, focus group
or observation
...
While
easy to get (if you have the cash), this data is not specific to your business and can be tough to sort
through, as you often get quite a bit more data than you need to meet your objective
...

Three key questions you need to ask before making a decision about the best method for your firm
...
BI technologies are capable of
handling large amounts of unstructured data to help identify, develop and otherwise create new
strategic business opportunities
...
Identifying new opportunities and implementing an effective strategy based on
insights can provide businesses with a competitive market advantage and long-term stability
...
Common
functions of business intelligence technologies are reporting, online analytical processing, analytics,
data mining, process mining, complex event processing, business performance management,
benchmarking, text mining, predictive analytics and prescriptive analytics
...

Basic operating decisions include product positioning or pricing
...
In all cases, BI is most effective when it combines
data derived from the market in which a company operates (external data) with data from company
sources internal to the business such as financial and operations data (internal data)
...


Business intelligence is made up of an increasing number of components including:

 Multidimensional aggregation and allocation
 Denormalization, tagging and standardization
 Real-time reporting with analytical alerts
 A method of interfacing with unstructured data sources
 Group consolidation, budgeting and rolling forecasts
 Statistical inference and probabilistic simulation
 Key performance indicators optimization
 Version control and process management
 Open item management

Business intelligence can be applied to the following business purposes, in order to drive business value
...

Analytics – program that builds quantitative processes for a business to arrive at optimal
decisions and to perform business knowledge discovery
...

Reporting/enterprise reporting – program that builds infrastructure for strategic reporting to
serve the strategic management of a business, not operational reporting
...

Collaboration/collaboration platform – program that gets different areas (both inside and
outside the business) to work together through data sharing and electronic data interchange
...
Knowledge management leads to learning
management and regulatory compliance
...
For example, if some
business metric exceeds a pre-defined threshold, the metric will be highlighted in standard reports, and
the business analyst may be alerted via e-mail or another monitoring service
...

Data can always be gathered using surveys
...
Keep it VERY simple
...
Customers are
visiting to purchase or to have an experience, not to fill out surveys
...
Choose only one objective for the survey
...

3
...
Open ended questions are tough to
manage
...

4
...
Why not? But rather than name and e-mail (leading to concerns
with confidentiality and often less than truthful answers) gather gender, age and income; you
might be surprised at who is actually buying what
...
What are the various steps involved in organizational decision
making?
2
...
What do you understand by Production Lines?
4
...

 Manufacturing is the production of merchandise for use or sale using labor and machines,
tools, chemical and biological processing, or formulation
...
It reduces production time drastically
...


1
...
Give the Dataset to the participants
...
Give the class 10 minutes for each group to discuss an issue at an
automobile giant X, where the quality of its bikes is not up to standard
...
The groups should also discuss the type of methods
they would like to use
4
...
(5 min each)
