
Showing posts with the label kpi. Show all posts

Monday, 6 June 2016

Life Cycle of a Data Science Project

When working with big data, data scientists benefit from following a well-defined workflow. Whether the goal is to tell a story through data visualization or to build a data model, the workflow process matters. A standard workflow for data science projects keeps the various teams within an organization in sync, so that avoidable delays do not pile up.


The end goal of any data science project is to produce an effective data product. The usable results produced at the end of a data science project are referred to as a data product. A data product can be anything that facilitates business decision-making to solve a business problem, such as a dashboard or a recommendation engine. To reach that goal, however, data scientists have to follow a formalized, step-by-step workflow. A data product should help answer a business question. The lifecycle of a data science project should not focus merely on the process but should place more emphasis on the data product. This post outlines the standard workflow that data scientists follow in data science projects.







Are you interested in learning how to implement the practical aspects of a data science project?


Write to: kontakt@beyondit.no, mob: 004794875183


Data science projects do not have a nice, clean lifecycle with well-defined steps like the software development lifecycle (SDLC). They usually run into delivery delays and repeated hold-ups because some steps in the lifecycle are non-linear, highly iterative, and cycle between the data science team and various other teams in the organization. At the outset it is very difficult for data scientists to determine the best way to proceed. Although the workflow might not be clean, data scientists ought to follow a standard workflow to achieve the output.







People often confuse the lifecycle of a data science project with that of a software engineering project. That should not be the case, as data science is more science and less engineering. There is no one-size-fits-all workflow for data science projects, and data scientists have to determine which workflow best fits the business requirements. However, there is a standard workflow based on one of the oldest and most popular methodologies, CRISP-DM. It was developed for data mining projects but is now also adopted by most data scientists, with modifications to suit the requirements of the data science project.


According to a recent KDnuggets poll on the question “What main methodology are you using for your analytics, data mining, or data science projects?”, CRISP-DM remained the top methodology/workflow for data mining and data science projects, with 43% of the projects using it.


Every step in the lifecycle of a data science project depends on various data scientist skills and data science tools. The typical lifecycle involves jumping back and forth among interdependent data science tasks using a variety of data science programming tools. The process begins with asking an interesting business question that guides the overall workflow of the project.


Standard Lifecycle of Data Science Projects


The data science project lifecycle is similar to the CRISP-DM lifecycle, which defines the following six standard steps for data mining projects:
Business Understanding
Data Understanding
Data Preparation
Modelling
Evaluation
Deployment


The lifecycle of data science projects is an enhancement of the CRISP-DM workflow with some alterations:
Data Acquisition
Data Preparation
Hypothesis and Modelling
Evaluation and Interpretation
Deployment
Operations
Optimization


1) Data Acquisition

To do data science, you need data. The first step in the lifecycle of a data science project is to identify the person who knows what data to acquire, and when to acquire it, based on the question to be answered. That person need not be a data scientist; anyone who knows the real differences between the available data sets and can make hard-hitting decisions about the organization's data investment strategy is the right person for the job.
A data science project begins with identifying the data sources, which could be logs from web servers, social media data, data from online repositories such as US Census datasets, data streamed from online sources via APIs, data obtained by web scraping, data in an Excel spreadsheet, or data from any other source. Data acquisition involves acquiring data from all the identified internal and external sources that can help answer the business question.
A major challenge that data professionals often encounter in the data acquisition step is tracking where each data slice comes from and whether it is up to date. It is important to track this information throughout the lifecycle of a data science project, as data might have to be re-acquired to test other hypotheses or run updated experiments.
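One way to keep track of where each data slice came from is to store a small provenance record next to it. The sketch below is only an illustration: the `DataSlice` class, its fields, the example sources and the 30-day freshness limit are all assumptions, not part of any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DataSlice:
    """Provenance record for one acquired slice of data (hypothetical schema)."""
    name: str
    source: str            # e.g. a URL, file path or API endpoint
    acquired_at: datetime  # when this slice was fetched (timezone-aware)

    def is_stale(self, now: datetime, max_age: timedelta) -> bool:
        """Return True if the slice is older than max_age."""
        return now - self.acquired_at > max_age


def slices_to_refresh(slices, now, max_age):
    """List the names of slices that should be re-acquired."""
    return [s.name for s in slices if s.is_stale(now, max_age)]


# Illustrative catalog; names, sources and dates are made up.
now = datetime(2016, 6, 6, tzinfo=timezone.utc)
catalog = [
    DataSlice("web_logs", "logs/webserver", datetime(2016, 6, 5, tzinfo=timezone.utc)),
    DataSlice("census", "census_extract.csv", datetime(2016, 1, 1, tzinfo=timezone.utc)),
]
stale = slices_to_refresh(catalog, now, max_age=timedelta(days=30))
print(stale)  # ['census']
```

With a record like this, re-running an experiment later starts by checking which slices need to be fetched again.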


2) Data Preparation

This step is often referred to as the data cleaning or data wrangling phase. Data scientists often complain that it is the most boring and time-consuming task, as it involves identifying various data quality issues. Data acquired in the first step is usually not in a usable format for the required analysis and might contain missing entries, inconsistencies and semantic errors.
Having acquired the data, data scientists have to clean and reformat it, either by manually editing it in a spreadsheet or by writing code. This step does not produce any meaningful insights by itself. However, through regular data cleaning, data scientists can identify what foibles exist in the data acquisition process, what assumptions they should make and what models they can apply to produce analysis results. After reformatting, the data can be converted to JSON, CSV or any other format that makes it easy to load into one of the data science tools.
Exploratory data analysis forms an integral part of this stage, as summarizing the clean data can help identify outliers, anomalies and patterns that can be used in the subsequent steps. This is the step that helps data scientists answer the question of what they actually want to do with the data.
“Exploratory data analysis is an attitude, a state of flexibility, a willingness to look for those things that we believe are not there, as well as those we believe to be there,” said John Tukey, an American mathematician.
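As a rough illustration of cleaning followed by a first exploratory summary, the sketch below drops missing entries and flags outliers with the 1.5 × IQR rule (often called Tukey's fences). It uses only the Python standard library; the function names and the sample readings are made up for the example.

```python
import statistics


def clean(values):
    """Drop missing entries (None or blank strings) and convert the rest to float."""
    return [float(v) for v in values if v is not None and str(v).strip() != ""]


def summarize(values):
    """Basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def iqr_outliers(values):
    """Return values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


# Made-up raw readings with missing entries and one suspicious value.
raw = ["12.5", None, "13.1", "", "12.9", "13.4", "98.0", "12.7"]
cleaned = clean(raw)
print(summarize(cleaned)["count"])  # 6
print(iqr_outliers(cleaned))        # [98.0]
```

In practice a library such as pandas would usually do this work; the point here is only the order of operations: clean first, then summarize and look for anomalies.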


3) Hypothesis and Modelling

This is the core activity of a data science project that requires writing, running and refining the programs to analyse and derive meaningful business insights from data. Often these programs are written in languages like Python, R, MATLAB or Perl. Diverse machine learning techniques are applied to the data to identify the machine learning model that best fits the business needs. All the contending machine learning models are trained with the training data sets.
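To make the idea of contending models concrete, here is a minimal sketch that trains two very simple candidates on the same training data: a mean baseline and a least-squares straight line. Both models, the function names and the numbers are assumptions chosen for illustration; a real project would typically use a machine learning library.

```python
def fit_mean(xs, ys):
    """Baseline model: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


# Made-up training data.
train_x = [1, 2, 3, 4, 5]
train_y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Every contending model is trained on the same training set.
candidates = {
    "mean_baseline": fit_mean(train_x, train_y),
    "linear": fit_line(train_x, train_y),
}
for name, model in candidates.items():
    print(name, round(model(6), 2))
```

Keeping the candidates in one dictionary makes it easy to score them all the same way later.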


4) Evaluation and Interpretation

Different kinds of models call for different evaluation metrics. For instance, if the machine learning model aims to predict the daily stock price, then RMSE (root mean squared error) should be considered for evaluation. If the model aims to classify spam emails, then performance metrics like average accuracy, AUC and log loss have to be considered. A common question when evaluating a machine learning model is which dataset should be used to measure its performance. Looking at the performance metrics on the training dataset is helpful but can mislead, because the numbers obtained might be overly optimistic as the model has already adapted to the training dataset. Model performance should be measured and compared on validation and test sets to identify the best model based on accuracy and over-fitting.
Steps 1 to 4 above are iterated as data is acquired continuously and business understanding becomes much clearer.
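The metrics named in this step can be written out in a few lines. The sketch below implements RMSE, accuracy and binary log loss in plain Python, then shows why training-set numbers can be overly optimistic, using a deliberately over-fitted "memorizer" model. The data and the memorizer are made up for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error for regression."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def accuracy(y_true, y_pred):
    """Fraction of correct class predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def log_loss(y_true, probs, eps=1e-15):
    """Binary log loss; probs are predicted probabilities of class 1."""
    total = 0.0
    for t, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)


# A model that memorizes its training data and falls back to the mean otherwise.
train = {1: 2.0, 2: 4.1, 3: 5.9}
validation = {4: 8.2, 5: 9.8}
fallback = sum(train.values()) / len(train)


def memorizer(x):
    return train.get(x, fallback)


train_rmse = rmse(list(train.values()), [memorizer(x) for x in train])
val_rmse = rmse(list(validation.values()), [memorizer(x) for x in validation])
print(train_rmse, round(val_rmse, 2))  # perfect on training data, poor on validation
```

The memorizer scores a perfect 0.0 on the data it has seen and about 5.06 on the held-out points, which is exactly the gap that validation and test sets are meant to expose.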


5) Deployment

Machine learning models might have to be recoded before deployment, for example because the data scientists favour the Python programming language but the production environment supports Java. After this, the models are first deployed in a pre-production or test environment before being deployed into production.


6) Operations/Maintenance

This step involves developing a plan for monitoring and maintaining the data science project in the long run. Model performance is monitored in this phase so that any degradation is clearly visible. Data scientists can archive their learnings from a specific data science project for shared learning and to speed up similar projects in the near future.
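A monitoring plan can be as simple as comparing a rolling average of a live metric against the level measured at deployment. The sketch below is one possible shape for such a check; the `PerformanceMonitor` class, the baseline, the tolerance and the window size are all hypothetical choices, not a prescribed method.

```python
from collections import deque


class PerformanceMonitor:
    """Flag degradation when the rolling mean score falls too far below a baseline."""

    def __init__(self, baseline, tolerance, window):
        self.baseline = baseline            # score measured at deployment time
        self.tolerance = tolerance          # allowed drop before raising a flag
        self.scores = deque(maxlen=window)  # most recent scores only

    def record(self, score):
        """Add a new score and report whether performance has degraded."""
        self.scores.append(score)
        return self.degraded()

    def degraded(self):
        if len(self.scores) < self.scores.maxlen:
            return False  # not enough observations yet
        rolling_mean = sum(self.scores) / len(self.scores)
        return rolling_mean < self.baseline - self.tolerance


# Made-up weekly accuracy figures for a deployed model.
monitor = PerformanceMonitor(baseline=0.90, tolerance=0.05, window=3)
alerts = [monitor.record(score) for score in [0.91, 0.89, 0.88, 0.80, 0.78]]
print(alerts)  # [False, False, False, False, True]
```

An alert like the last one is the kind of signal that would prompt retraining or a closer look at incoming data.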


7) Optimization

This is the final phase of any data science project. It involves retraining the machine learning model in production whenever new data sources come in, or taking whatever steps are necessary to keep up the performance of the model.
Having a well-defined workflow makes any data science project less frustrating for data professionals to work on. The lifecycle described above is not definitive and can be altered to improve the efficiency of a specific data science project as per the business requirements.

DeZyre’s Data Science training course in Python and R programming helps you learn about the entire lifecycle of data science projects, right from data acquisition to model evaluation.















Monday, 23 May 2016

KPI for Hospitality Business

Hospitality service in numbers

Key Performance Indicators (KPIs) for the hospitality industry help remove the guesswork from managing the business by checking the numbers that tell what’s really happening.
There’s a business saying: ‘If you can’t measure it, you can’t manage it!’ Real, responsive management needs reliable and truthful figures on which decisions can be based. If there are problems, you can take corrective action quickly. If you are having success, you’ll know to do more of what you’re doing! Good figures also give you a wider understanding of your success: sometimes in a quiet month (when your suppliers are telling you that ‘everyone’s quiet!’) you’ll see that some of your KPIs, e.g. sales per head, are actually improving.
KPIs in the hospitality industry can be categorized by function, such as Reception, Housekeeping, Maintenance, Kitchen, Restaurant, Sales, Store, Purchasing, etc.
Staff KPI:
- Wage Cost %: wage costs as a percentage of sales
- Total Labour Cost %: not just wages but also work cover insurance, retirement and superannuation charges, and other taxes that apply to your payroll
- Total Labour Hours: how many hours worked in each section. This is useful to compare against sales to measure productivity
- Event Labour charge-out: Hotels usually charge-out service staff at a markup on the cost of the wages paid. Are you achieving a consistent mark-up?
- Labour turnover: number of new staff in any one week or month
- Average length of employment: another way to look at your success in keeping staff. Add up the total number of weeks all your people have worked for you and divide this by the total number of staff
- Average hourly pay: divide the total payroll by the number of hours worked by all staff
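Several of the staff KPIs above are simple ratios. The sketch below writes three of them out as functions, following the descriptions in the list; the function names and the sample figures are made up for illustration.

```python
def wage_cost_pct(wage_cost, sales):
    """Wage Cost %: wage costs as a percentage of sales."""
    return 100 * wage_cost / sales


def average_hourly_pay(total_payroll, hours_worked):
    """Total payroll divided by the hours worked by all staff."""
    return total_payroll / hours_worked


def average_length_of_employment(weeks_per_person):
    """Total weeks worked by all staff divided by the number of staff."""
    return sum(weeks_per_person) / len(weeks_per_person)


# Made-up weekly figures.
print(wage_cost_pct(wage_cost=3000, sales=10000))          # 30.0
print(average_hourly_pay(total_payroll=3000, hours_worked=150))  # 20.0
print(average_length_of_employment([10, 52, 104, 26]))     # 48.0
```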
Kitchen Management KPI:
- Food Cost %: measured by adding up food purchases for the week and measuring them against your food sales
- Total Food Costs: how much was the total food bill? Sometimes a useful figure to show staff who think you are made of money
- Food Costs per head: see every week how much it costs to feed an average customer
- Kitchen Labour %: measure kitchen productivity by comparing kitchen labour against food sales
- Kitchen Labour hours: how many hours worked in this section? Compare against sales to measure productivity
- Stock value: food stock holding. It should be less than a week’s use, but can slip out if you are storing frozen food
- Main selling items: check weekly sales from the POS or dockets to know the best sellers, and map these on the Menu Profitability
- Kitchen linen costs: cost of uniforms, aprons & tea-towels can be a shock! How many tea-towels are being used each day?
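The kitchen ratios can be computed from a week's totals in the same way. This is a sketch under the assumption that purchases, labour cost, food sales and customer counts are all taken over the same week; the names and numbers are illustrative.

```python
def food_cost_pct(food_purchases, food_sales):
    """Food Cost %: the week's food purchases measured against food sales."""
    return 100 * food_purchases / food_sales


def food_cost_per_head(total_food_cost, customers):
    """What it costs to feed an average customer."""
    return total_food_cost / customers


def kitchen_labour_pct(kitchen_labour_cost, food_sales):
    """Kitchen Labour %: kitchen labour cost compared against food sales."""
    return 100 * kitchen_labour_cost / food_sales


# Made-up weekly figures.
print(food_cost_pct(food_purchases=2800, food_sales=8000))          # 35.0
print(food_cost_per_head(total_food_cost=2800, customers=400))      # 7.0
print(kitchen_labour_pct(kitchen_labour_cost=2000, food_sales=8000))  # 25.0
```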
Front House Management KPI:
- Total Sales Per Head: total sales divided by number of customers. This may vary between different times of the day
- Number of customers: simple! A good measure of popularity
- Food, Dessert, Beverage Sales per head: how much your menu appeals to your customers (do you have all the choices they want), & how well your staff are selling.
- Seating Efficiency: how well are tables being turned over while still offering high quality customer service
- Basket Analysis: how many items do lunch customers buy? What else do morning coffee drinkers order? Grab a pile of dockets and look for ordering patterns
- Linen costs: uniforms, aprons etc.
- Front of House Labour %: measure front-of-house productivity by comparing front-of-house labour cost against sales
- FOH Labour hours: how many hours worked in this section? Compare against sales to measure productivity
- Customer satisfaction: Feedback forms, complaints and other methods that are hard to quantify sometimes but worth making an attempt.
- Strike rate: if 500 people came to the hotel last night and only 100 ate at the bistro, your ‘strike rate’ would be 1 in 5, or 20%
- RevPASH (Revenue per Available Seat Hour): take the total number of ‘seat hours’ (seats multiplied by hours open) and divide total revenue for the period by this number
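Strike rate and RevPASH are easy to get wrong by a factor, so a worked sketch may help. The first call reproduces the 500-guest example from the list; the RevPASH inputs (60 seats open for 4 hours) are made-up figures, and seat hours are assumed to be seats multiplied by hours open.

```python
def sales_per_head(total_sales, customers):
    """Total sales divided by the number of customers."""
    return total_sales / customers


def strike_rate_pct(diners, visitors):
    """Share of people on the premises who actually ate, as a percentage."""
    return 100 * diners / visitors


def revpash(revenue, seats, hours_open):
    """Revenue per Available Seat Hour: revenue divided by seats * hours open."""
    return revenue / (seats * hours_open)


print(strike_rate_pct(diners=100, visitors=500))         # 20.0
print(sales_per_head(total_sales=4800, customers=160))   # 30.0
print(revpash(revenue=4800, seats=60, hours_open=4))     # 20.0
```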
Bar & Restaurant Management KPI:
- Sales per head: how much your beverage and wine appeals to your customers and how well your staff are selling
- Gross Profit on sales: difference between what you sold and what it cost you. The sales mix can influence this heavily
- Average Profit % on sales: useful to see if your sales are holding steady, although ultimately the actual Gross Profit (real money) will matter the most
- Stock value: It’s worth checking with your suppliers and seeing how much you can order ‘just in time’
- Stock turnover: how fast is your cellar stock selling?
- Carrying cost of stock: what is the cost of financing the stock?
- Sales / stock-take discrepancies: alcohol is a security problem, and keeping an eye on ‘shrinkage’, staff drinks and stealing is a constant task
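Gross profit and stock-take discrepancies can be sketched as follows. The discrepancy formula (opening stock plus purchases minus recorded sales, compared with the physical count) is an assumption about how the check would be done, and all names and figures are illustrative.

```python
def gross_profit(sales, cost_of_sales):
    """Difference between what you sold and what it cost you."""
    return sales - cost_of_sales


def gross_profit_pct(sales, cost_of_sales):
    """Gross profit as a percentage of sales."""
    return 100 * (sales - cost_of_sales) / sales


def stock_discrepancy(opening_units, purchased_units, sold_units, counted_units):
    """Units unaccounted for: expected stock minus the physical stock-take count."""
    expected_units = opening_units + purchased_units - sold_units
    return expected_units - counted_units


# Made-up figures for one period.
print(gross_profit(sales=5000, cost_of_sales=1500))      # 3500
print(gross_profit_pct(sales=5000, cost_of_sales=1500))  # 70.0
print(stock_discrepancy(opening_units=120, purchased_units=60,
                        sold_units=100, counted_units=74))  # 6
```

A positive discrepancy means stock has gone missing relative to recorded sales, which is the ‘shrinkage’ the list warns about.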
Banquet Sales Management KPI:
- Number of customers: simple! A good measure of popularity.
- Visits by your top 100 or 200 customers: they provide a huge proportion of your sales! Track their frequency and spending – these people are gold!
- Sales per head: across all areas
- Marketing and advertising costs: total value of spend, always trying to measure it against response
- Response rates: how many people responded to different campaigns and what effect did this have on profit?
- Press mentions: keeping your eyes open for favourable mentions
- Bookings: in the current week and month and coming up. Also in peak times, e.g. Christmas.
- Event inquiries: number of inquiries about large bookings and functions, especially if a campaign to promote them is on
- Sales inquiry conversion rate: number of inquiries that turn into actual sales. Ask why so few people were ‘converted’: was it the quality of the promotional material, the skill of the sales staff, pricing, or the make-up of your function menus and facilities?
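Response and conversion rates are both simple percentages. The sketch below shows them side by side; the function names and the campaign numbers are hypothetical.

```python
def response_rate_pct(responses, people_reached):
    """Share of people reached by a campaign who responded."""
    return 100 * responses / people_reached


def conversion_rate_pct(sales_made, inquiries):
    """Share of inquiries that turned into actual sales."""
    return 100 * sales_made / inquiries


# Made-up campaign: 2,000 flyers, 48 inquiries, 12 confirmed bookings.
print(response_rate_pct(responses=48, people_reached=2000))  # 2.4
print(conversion_rate_pct(sales_made=12, inquiries=48))      # 25.0
```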
Finance & Admin Management KPI:
- Cash position at bank: how much do you have available after reconciling your cheque book?
- Stock-take discrepancies: measure of efficiency of each department, but also of administrative systems in place
- Total accounts due: how much do you owe?
- Total accounts receivable: how much is owed to you? Needs careful management if customers have accounts with you, e.g. at large restaurants
- Return on Investment: the profit a business makes can be measured as a percentage return on the amount invested in it
- Taxes owed: know how much is owed at any one time so it is not ‘spent’
- Sales & costs: actual figures compared to what was budgeted for a period
- Administration labour costs: strong and skilful administrative support will be essential to manage the KPIs listed above!
- IT efficiency: how much down-time for IT systems? How accurate is the POS system?
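Return on investment and the actual-versus-budget comparison can be expressed like this. It is a minimal sketch with made-up figures; the variance is defined here as actual minus budget, which is one common convention rather than the only one.

```python
def roi_pct(profit, amount_invested):
    """Profit as a percentage return on the amount invested."""
    return 100 * profit / amount_invested


def budget_variance(actual, budgeted):
    """Actual figure minus the budgeted figure for the same period."""
    return actual - budgeted


# Made-up annual and monthly figures.
print(roi_pct(profit=50000, amount_invested=400000))  # 12.5
print(budget_variance(actual=10500, budgeted=10000))  # 500
```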
Other KPIs:
- Revenue per available room
- Average daily rate of rooms
- % of occupancy of rooms
- Average cleaning costs per room
- % of reservation requests cancelled with / without penalty
- % of rooms with maintenance issues
- % of cancelled reservation requests
- Average number of guests per room
- Average length of stay of guests
- % of non-room revenue
- % of cancelled rooms occupied
- Kilowatt-hours (kwh) per room
- Number of hotel guests per employee
- Gross operating profits per available room
- % of guests who would rank stay as exceeding expectations
- Waste per night per occupied bed space
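The list above names the room KPIs without defining them. The sketch below uses the usual industry definitions (revenue per available room, average daily rate as room revenue per room sold, and occupancy as rooms sold over rooms available); the function names and the one-night figures are made up.

```python
def occupancy_pct(rooms_sold, rooms_available):
    """% of occupancy: rooms sold as a share of rooms available."""
    return 100 * rooms_sold / rooms_available


def average_daily_rate(room_revenue, rooms_sold):
    """ADR: room revenue divided by the number of rooms sold."""
    return room_revenue / rooms_sold


def revpar(room_revenue, rooms_available):
    """RevPAR: room revenue divided by the number of rooms available."""
    return room_revenue / rooms_available


# Made-up figures for one night in a 100-room hotel.
room_revenue, rooms_sold, rooms_available = 9000, 75, 100
print(occupancy_pct(rooms_sold, rooms_available))    # 75.0
print(average_daily_rate(room_revenue, rooms_sold))  # 120.0
print(revpar(room_revenue, rooms_available))         # 90.0
```

Note that RevPAR equals ADR multiplied by the occupancy fraction (120 × 0.75 = 90), which is a handy cross-check on the three numbers.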