Pedro Mota

Information Management Portfolio

Goal: Building a predictive model that maximizes the profit of the company's next direct marketing campaign by targeting the customers most likely to accept the offer.

Data: Groups were given a dataset resulting from a pilot campaign, containing variables related to client information, their purchases of company products, and whether each client accepted the pilot campaign offer.

Performed functions: Data access, exploration and understanding; Data preparation; Modelling; Assessment.

Technologies: SAS Enterprise Miner.

Date: May - June 2020
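
The targeting idea behind this goal can be sketched in a few lines. The project itself was built in SAS Enterprise Miner, so the Python below is only an illustration: the function names, the purchase probabilities, the revenue per accepted offer and the cost per contact are all hypothetical values, not figures from the project. It assumes a model has already produced a purchase probability per customer and keeps only the customers whose expected profit from a contact is positive.

```python
def select_targets(purchase_probs, revenue_per_sale, cost_per_contact):
    """Return indices of customers with positive expected profit, best first."""
    expected = [
        (p * revenue_per_sale - cost_per_contact, i)
        for i, p in enumerate(purchase_probs)
    ]
    chosen = [(profit, i) for profit, i in expected if profit > 0]
    chosen.sort(key=lambda item: -item[0])  # highest expected profit first
    return [i for _, i in chosen]


def expected_campaign_profit(purchase_probs, targets, revenue_per_sale, cost_per_contact):
    """Sum the expected profit over the contacted customers only."""
    return sum(
        purchase_probs[i] * revenue_per_sale - cost_per_contact for i in targets
    )


# Hypothetical model outputs and campaign economics, for illustration only.
probs = [0.05, 0.40, 0.22, 0.75]
targets = select_targets(probs, revenue_per_sale=10.0, cost_per_contact=3.0)
profit = expected_campaign_profit(probs, targets, revenue_per_sale=10.0, cost_per_contact=3.0)
```

With these made-up numbers only the second and fourth customers are worth contacting, because the other two have an expected revenue below the contact cost.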

Dataset Variables

Variable Worth

Model Evaluation

Goal: Building a predictive model that answers the question “What people are more likely to quit their position at the company?” using the employee dataset provided.

Data: Groups were given a dataset containing employee information and each employee's churn risk, labelled as “low”, “medium” or “high”.

Performed functions: Data access, exploration and understanding; Data preparation; Modelling; Assessment.

Technologies: Python, Jupyter Notebooks, NumPy, pandas, ML Algorithms.

Date: February - May 2020

Dataset Variables

Variable Worth

Model Evaluation - Best scores obtained with Random Forest (RF)

Goals: Building a predictive model to determine which clients are at risk of churning. Building a business case around possible actions to help mitigate the increase in customer churn.

Data: Groups were given a dataset containing customer and service information, plus whether the client churned or not.

Performed functions: Data access, exploration and understanding; Data preparation; Modelling; Assessment; Business case definition.

Technologies: Python, Spyder, NumPy, pandas, ML Algorithms.

Date: October 2019 - January 2020

Dataset Variables

Predictive Model

Business Case

Goal: Finding potentially interesting customer patterns that could provide meaningful insights into the customers and their buying habits.

Data: Groups were given a dataset containing variables related to client information and their purchases of company products.

Performed functions: Data access, exploration and understanding; Data clustering; Business case definition.

Technologies: Python, Jupyter Notebooks, NumPy, pandas, SOM.

Date: October 2019 - January 2020
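
As a rough illustration of the clustering step, here is a minimal sketch of plain k-means (Lloyd's algorithm) in standard-library Python. It is not the code used in the project, which does not state which implementation was applied; the `kmeans` function, the two customer features and the starting centroids are all assumptions made up for this example.

```python
def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm on tuples of floats, starting from given centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid (squared distance).
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Hypothetical customers described by (purchases per year, average basket size).
customers = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
labels, centroids = kmeans(customers, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

On this toy data the first three customers end up in one segment and the last three in another. Real customer data would normally be scaled first so that no single variable dominates the distance.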

Dataset Variables

KMeans and SOM Clusters

Marketing Campaigns

Goal: Implementing an OLAP cube and developing a varied set of reports, dashboards and dynamic analyses on top of both the cube and the data warehouse.

Data: Groups were provided with a Data Warehouse containing data from a fictional company. The data consisted of company product, employee, customer and sales information.

Performed functions: SSAS OLAP cube building; KPI and metric definition and implementation; SSRS report building; Power BI dashboard building.

Technologies: SQL, SQL Server, SSMS, SSAS, SSRS, Power BI.

Date: February - June 2020

OLAP Cube

Total Sales per Month SSRS Report

Manufacturing Cost Analysis Power BI Dashboard

Goal: Design, implementation and explanation of a fully working Data Warehouse solution.

Data: Groups were given a relational database and flat files containing data from a fictional company. The data consisted of company product, employee, customer and sales information.

Performed functions: Data access, exploration and understanding; Dimensional model design; Staging Area; ETL processes.

Technologies: SQL, SQL Server, SSMS, SSIS.

Date: October 2019 - January 2020

Database

Data Warehouse

Title: Assessing COVID-19 impact on user opinion towards videogames

Subtitle: Sentiment analysis and structural break detection on steam data

Goal: Detect whether the emotions brought on by the pandemic, and the role video games played in entertaining individuals, changed the sentiment displayed in user reviews.

Method: User review data was collected from Steam and processed. Sentiment polarity values were extracted from English-language reviews using a set of different algorithms and analysed over time. The last step consisted of testing for structural breaks in the resulting time series.

Technologies: Python, Jupyter Notebooks, RStudio.

Date: October 2020 - November 2021
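
To give an idea of what looking for a structural break in a sentiment time series involves, here is a minimal sketch, assuming a single shift in the mean. It is not the method used in the thesis, whose specific tests are not listed here; the `find_mean_shift` function and the monthly polarity values are hypothetical. The sketch tries every admissible split point and keeps the one that minimizes the squared error around each segment's own mean.

```python
def find_mean_shift(series, min_segment=2):
    """Return (break_index, sse) for the best split into series[:i] and series[i:]."""

    def sse(segment):
        # Sum of squared deviations from the segment's own mean.
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    best_index, best_sse = None, float("inf")
    for i in range(min_segment, len(series) - min_segment + 1):
        total = sse(series[:i]) + sse(series[i:])
        if total < best_sse:
            best_index, best_sse = i, total
    return best_index, best_sse


# Hypothetical average review polarity per month, for illustration only.
monthly_polarity = [0.30, 0.32, 0.29, 0.31, 0.45, 0.47, 0.44, 0.46]
break_index, break_sse = find_mean_shift(monthly_polarity)
```

Here the best split falls at index 4, where the made-up series jumps to a higher level. A search like this only locates a candidate break; a formal test is still needed to decide whether the shift is statistically significant.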

Find my master's thesis here!

About me

Contact

Group work: Predictive model to support direct marketing initiatives

Group work: Predictive model to understand employee churn

Group work: Predictive model to understand customer churn

Group work: Studying customer patterns and identifying segments

Group work: Building an Analysis and Reporting BI solution

Group work: Building a Data Warehouse solution