SAS is a command-driven statistical software suite widely used for statistical data analysis and visualization. SAS originally stood for Statistical Analysis System. It lets you apply quantitative techniques and processes that help improve employee productivity and business profits. SAS is also used for advanced analytics such as business intelligence, crime investigation, and predictive analysis. SAS is pronounced "sass."
In SAS, data is extracted and categorized, which helps you identify and analyze patterns in the data. The suite lets you perform advanced analysis, business intelligence, predictive analysis, and data management so that you can operate effectively in competitive and changing business conditions. Moreover, SAS is platform independent: it runs on major operating systems such as Linux and Windows.
Compared to other BI tools, SAS provides extensive support for transforming and analyzing data programmatically, in addition to a drag-and-drop interface. This very granular control over data manipulation and analysis is its USP.
Let's understand the need for SAS with a simple example:
Consider an e-commerce company that wants to know the buying patterns of its customers based on historical data. The company will have to consider thousands of records across many customers to get a generalized insight.
The company may not have all the data required for the analysis. For example, if a customer did not buy a jacket, what factors stopped them from buying it? This missing data could create errors in your analysis. How can we get rid of these problems? How can we handle this type of data?
If done manually, this task would require hundreds of analysts and thousands of man-hours. With the SAS analytics tool, a single analyst can do the same analysis in a matter of hours. SAS lets you eliminate unnecessary data and make the most of the relevant information. It enables you to predict an outcome even with missing data, and so helps you make better decisions.
R: It is open-source software. R is well documented, which makes it easier to learn, and it offers strong statistical capabilities.
Python: It is another popular open-source scripting language. It supports libraries such as NumPy, SciPy, and Matplotlib, which you can use to perform most statistical operations and build a wide range of models.
SAS: It is a widely used analytical tool in the commercial analytics market, with a plethora of statistical functions and a good GUI.
In this SAS programming tutorial, we will discuss SAS (Statistical Analysis System) and how it can be used to solve our problems.
Next in this SAS language tutorial, we will learn about features of SAS.
Key features of SAS are:
Next in this SAS for beginners tutorial, we will learn about SAS Product suite.
Many SAS products are available in the market. Following is a list of the more popular ones.
|Base SAS||Base SAS software offers hardware agility and integrates into all kinds of computing environments.|
|SAS/GRAPH||This tool helps you represent structured data as graphs.|
|SAS/STAT||This tool helps you perform statistical analysis, including analysis of variance, regression, and psychometric analysis.|
|SAS/ETS||It is used for forecasting and helps you perform time series analysis.|
|SAS/IML||IML stands for Interactive Matrix Language. This tool helps you translate mathematical formulas into programs.|
|SAS EBI||A tool for business intelligence applications.|
|SAS Grid Manager||It distributes SAS workloads across a grid of machines, offering workload balancing and high availability.|
|SAS/OR||A tool for operations research.|
|SAS/QC||Used for quality control.|
|SAS Enterprise Miner||Data mining.|
|SAS/PH||Clinical trial analysis.|
|SAS/AF||It offers an applications facility.|
|Enterprise Guide||A GUI-based code editor and project manager.|
Next in this SAS tutorials guide, we will learn about SAS architecture.
SAS architecture is divided into three main tiers: the client tier, the middle tier, and the back tier.
The client tier is where the application is installed, on the machine where the user sits. It consists of the components used to view the portal and its content. It also includes a standard web browser that interacts with the portal over the standard HTTP or HTTPS protocol, which helps make the SAS web application firewall friendly.
The middle tier offers a centralized access point for enterprise information. All access to content is processed by components operating in this tier. Separating the business logic from the display logic lets you leverage the logic of the middle tier. Moreover, centralized points of access make it easier to enforce security rules, administer the portal, and manage code changes.
The middle tier hosts the following functions:
SAS Information Delivery Portal Web Application: It is a collection of JSPs, Java servlets, JavaBeans, and other classes and resources. These components use information stored in the enterprise directory to create a customizable interface for the user.
Servlet Engine: The servlet engine is also called a servlet container. It is responsible for managing the SAS Information Delivery Portal Web Application. The servlet engine offers a runtime environment that provides concurrency, deployment, lifecycle management, and so on.
Web server: The web server provides services for the servlet engine and can be used to host websites, which should be accessed through the portal.
The back tier is where the data and computation servers run; these may contain business objects. It also includes an enterprise directory server, which maintains metadata about content located throughout the enterprise.
Local installation on your machine
Step 1) Download SAS from the given link
Go to https://www.sas.com/en_in/software/university-edition.html and click on Get Free Software.
Step 2) Select your Operating System
Select the operating system that matches your machine.
Step 3) Download and install Virtualization Software
SAS requires virtualization software such as VirtualBox to be installed before SAS itself can be installed.
Follow the on-screen steps to install SAS. Running a local install with VirtualBox can sometimes be tricky, so we recommend the AWS installation:
You can deploy SAS on AWS. It is eligible for the free tier.
Step 1) Go to https://aws.amazon.com/marketplace/pp/B00WH10IKW. Click "Continue to Subscribe"
Step 2) On the next screen, accept the terms.
Step 3) The subscription stays pending for up to 10 minutes while it is approved. You will see the following screen.
Step 4) Refresh the page, and you will see the subscription confirmed. Click on Continue to Configuration.
Step 5) Keep the default settings and click Continue to Launch.
Step 6) Review the configuration page. Select a key pair. The rest of the settings should stay at their defaults. Click Launch.
Step 7) Go to https://aws.amazon.com/marketplace/library/ and click on View Instances.
Step 8) In the popup
Step 9) In the popup, that appears after you click in step 8
Step 10) You will see the welcome screen.
If you are not able to connect, go to https://console.aws.amazon.com/ec2/v2/home?region=us-east-1#SecurityGroups:sort=groupId and set the inbound/outbound rules to allow all traffic.
To use SAS software effectively, you need to follow four steps: access data, manage data, analyze, and present.
SAS allows you to access data in the format you want.
You can access data stored anywhere, whether in a file on your system or in another database system. It can be an Oracle database, a SAS data set, a raw data file, or a simple XLS/CSV file. SAS helps you access this data with ease.
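As a minimal sketch of reading an external file, PROC IMPORT can load a CSV file into a SAS data set. The file path and the output data set name work.sales are hypothetical placeholders; point DATAFILE= at a file that exists on your system.

```sas
/* Hypothetical path and data set name, chosen only for illustration. */
PROC IMPORT DATAFILE = "/folders/myfolders/sales.csv"
    OUT = work.sales      /* SAS data set to create */
    DBMS = CSV            /* type of the source file */
    REPLACE;              /* overwrite work.sales if it already exists */
    GETNAMES = YES;       /* take variable names from the first row */
RUN;
```

Once the step has run, work.sales can be used like any other SAS data set in later DATA and PROC steps.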
SAS offers great data management capabilities. You can subset or slice data based on conditions, create variables, and clean and validate data. Other tools can perform the same tasks, but SAS makes the job easy.
SAS has well-defined libraries and procedures, which make programming easy. Moreover, creating a variable or subsetting data is a one-step process, so a single line of code can save you from writing a complex algorithm.
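The subsetting and variable creation described above can be sketched in one short DATA step. This example assumes the sample data set sashelp.class that ships with SAS (variables name, sex, age, height in inches, weight); the output name work.teens and the new variable height_cm are made up for illustration.

```sas
DATA work.teens;                  /* hypothetical output data set */
    SET sashelp.class;            /* read the sample data set */
    WHERE age >= 13;              /* subset: keep only rows that meet the condition */
    height_cm = height * 2.54;    /* create a new variable in a single statement */
RUN;
```

Each of the two tasks takes one statement: WHERE filters the rows, and an assignment adds the variable.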
You can do various kinds of analysis using SAS:
SAS can easily handle all these analyses. It is a strong tool for accurate forecasting.
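As a small illustration of the analysis stage, the sketch below runs a frequency count and a correlation on the sample data set sashelp.class. The choice of these two procedures is an assumption made for the example, not a list taken from the text.

```sas
/* Frequency analysis: how many observations fall in each category of sex */
PROC FREQ DATA = sashelp.class;
    TABLES sex;
RUN;

/* Correlation analysis: how strongly height and weight move together */
PROC CORR DATA = sashelp.class;
    VAR height weight;
RUN;
```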
If you visualize data correctly, it is effortless for the audience to relate to it. It is essential that your tool presents the data in a suitable manner. That is what SAS does for you: it has excellent presentation capabilities, including:
1. List reports
2. Summary reports
3. Graph reports
4. Print reports
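A rough sketch of the first three report types, again using the sample data set sashelp.class. Which procedure maps to which report type is an assumption made for illustration.

```sas
/* List report: one row per observation, selected columns only */
PROC PRINT DATA = sashelp.class NOOBS;
    VAR name age height;
RUN;

/* Summary report: statistics of height and weight for each value of sex */
PROC MEANS DATA = sashelp.class MEAN MIN MAX;
    CLASS sex;
    VAR height weight;
RUN;

/* Graph report: bar chart of the mean height for each value of sex */
PROC SGPLOT DATA = sashelp.class;
    VBAR sex / RESPONSE = height STAT = MEAN;
RUN;
```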
A SAS program consists of three necessary steps: the DATA step, the PROC step, and the output step.
The DATA step loads the required data set into SAS memory and identifies the variables of the data set. It also captures the records. We can use DATA steps to:
The syntax for the DATA step is:
DATA data_set_name;               /* Give the data set a name */
    INPUT var1 var2 var3;         /* Declare the variables in the data set */
    new_var = expression;         /* Define new variables */
    LABEL var1 = 'label text';    /* Give variables a label */
    /* Provide the data, one record per line, after DATALINES */
    DATALINES;
data_rows
;
RUN;
The following example shows how to name the data set, define variables, create new variables, and enter the data. In the INPUT statement, string variables are followed by a $, and numeric variables are not.
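DATA Employee_Info;    /* data set name assumed for illustration */
    /* :$12. reads DEPARTMENT with a length of 12, so 'Operations' is not cut off at the default 8 characters */
    INPUT ID $ NAME $ SALARY DEPARTMENT :$12.;
    comm = SALARY*1.50;
    LABEL ID = 'Emp_ID' comm = 'COMMISSION';
    DATALINES;
1 Tom 5000 IT
2 Harry 6000 Operations
3 Michelle 7000 IT
4 Dick 8000 HR
5 John 9000 Finance
;
RUN;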
Note: To execute SAS statements, you need to end the step with the RUN statement.
The PROC step performs a specific analysis or function to produce results and reports. Its syntax is:
PROC procedure_name options;    /* The name of the procedure, followed by its options */
RUN;
The following example uses the MEANS procedure to print the mean values of the numeric variables in the data set.
PROC MEANS;    /* with no DATA= option, the most recently created data set is used */
RUN;
You can display the data from the data set with conditional output statements.
PROC PRINT DATA = data_set;
    WHERE condition;    /* optional: print only the rows that satisfy the condition */
RUN;
Every SAS program follows the steps above to read the input data, analyze it, and output the results of the analysis. The RUN statement at the end of each step finishes the execution of that step.
The complete code for the above steps is given below.
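As a minimal sketch, the three steps can be combined into one program. It reuses the employee records from the DATA step example; the data set name Employee_Info, the :$12. informat, and the SALARY > 6000 condition are assumptions chosen for illustration.

```sas
/* DATA step: create the data set and derive a new variable */
DATA Employee_Info;
    /* :$12. gives DEPARTMENT a length of 12 so 'Operations' is not truncated */
    INPUT ID $ NAME $ SALARY DEPARTMENT :$12.;
    comm = SALARY * 1.50;
    LABEL ID = 'Emp_ID' comm = 'COMMISSION';
    DATALINES;
1 Tom 5000 IT
2 Harry 6000 Operations
3 Michelle 7000 IT
4 Dick 8000 HR
5 John 9000 Finance
;
RUN;

/* PROC step: summary statistics for the numeric variables */
PROC MEANS DATA = Employee_Info;
RUN;

/* Output step: print only the rows that meet the condition */
PROC PRINT DATA = Employee_Info LABEL;   /* LABEL shows variable labels as column headings */
    WHERE SALARY > 6000;
RUN;
```

Naming the data set explicitly with DATA= in each PROC step avoids depending on whichever data set happened to be created last.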
Some important SAS applications are given below:
|Pharmaceutical||Statistical Analysis, Reporting|
|Telecom||ETL, Reporting, Data Mining, Forecasting|
|Financials||ETL, Reporting, Data Mining, Financial research|
|Predictive modeling||DBMarketing, Activity-Based Management|
|Healthcare||ETL, reporting, Data Mining|
|SAS||R|
|SAS is commercial software, so it requires a financial investment.||R is open-source software, so anyone can use it.|
|SAS is one of the easiest analytical tools to learn. Even people with limited knowledge of SQL can learn it quickly.||R requires you to write complicated and lengthy code.|
|SAS is highly preferred by big companies and is quite technically advanced and user-friendly.||R is fast-developing software; however, you need to keep upgrading it.|
|SAS has good graphical support but does not offer any customization.||The graphical support of R is very poor.|