Junior Data Analyst / Data Scientist

Carlos Peñalver

Turning raw data into clear insights, visualizations and predictive models.

Computer Engineering graduate focused on data analysis, SQL, machine learning and applied data projects. I enjoy working with real datasets, understanding the business problem behind them and transforming complex information into useful conclusions.

Revenue & logistics overview

SQL EDA

Model metrics

Accuracy: 82%
Recall: 76%
F1-score: 79%

SELECT segment, rate
FROM campaign_summary
WHERE subscribed = true;

model.fit(X_train, y_train)

About

Focused on practical data work

I am a Computer Engineering graduate currently focused on data analysis, data science and applied machine learning. My recent projects combine SQL, Python, exploratory data analysis, data preprocessing, visualization and predictive modeling. I am especially interested in turning imperfect or complex datasets into clear, actionable insights.

Contribution

What I can contribute

Data Analysis

  • Data cleaning and transformation
  • Exploratory data analysis
  • Pattern detection and business insights
  • Data visualization

SQL & Databases

  • SQL queries for analysis
  • Relational and dimensional modeling
  • Views, functions, scripts and indexes
  • Analytical workflows from raw data to insights

Machine Learning

  • Classification models
  • Data preprocessing
  • Feature engineering
  • Model evaluation with Accuracy, Precision, Recall and F1-score
  • Comparison of models before and after preprocessing

Featured Projects

Projects that show the workflow

SQL EDA E-commerce / Olist

Exploratory data analysis project using an e-commerce dataset, building a complete SQL workflow from raw CSV files to a dimensional model, analytical queries and business insights.

MySQL · SQL · Data Modeling · Dimensional Model · EDA
  • Loaded CSV files into raw SQL tables
  • Built a dimensional model with fact and dimension tables
  • Created analytical SQL scripts, views, functions and indexes
  • Generated insights about revenue, product categories, logistics and time trends
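
To give a feel for the shape of this workflow, here is a minimal sketch of a fact table, a dimension table, an index and an analytical view. It is an illustration only: it uses Python's built-in sqlite3 module instead of MySQL so that it runs without a server, and the table names, column names and rows are hypothetical rather than taken from the project.

```python
import sqlite3

# In-memory database so the sketch runs anywhere (the project itself used MySQL).
conn = sqlite3.connect(":memory:")

# Hypothetical star-schema fragment: one dimension, one fact table,
# an index on the join key and a view for a recurring question.
conn.executescript("""
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    category   TEXT
);
CREATE TABLE fact_order_item (
    order_id   INTEGER,
    product_id INTEGER REFERENCES dim_product(product_id),
    price      REAL
);
CREATE INDEX idx_fact_product ON fact_order_item(product_id);
CREATE VIEW v_revenue_by_category AS
SELECT p.category, SUM(f.price) AS revenue
FROM fact_order_item AS f
JOIN dim_product AS p ON p.product_id = f.product_id
GROUP BY p.category;
""")

# Made-up sample rows, only to exercise the view.
conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                 [(1, "toys"), (2, "books")])
conn.executemany("INSERT INTO fact_order_item VALUES (?, ?, ?)",
                 [(100, 1, 30.0), (100, 2, 15.0), (101, 2, 5.0)])

rows = conn.execute(
    "SELECT category, revenue FROM v_revenue_by_category ORDER BY revenue DESC"
).fetchall()
print(rows)
```

Keeping the aggregation in a view means the same revenue question can be reused across analyses without repeating the join.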

Bank Marketing Analysis

Analysis of a bank marketing campaign dataset to identify patterns related to customer subscription, segmentation and campaign performance.

Python · Pandas · Matplotlib · Seaborn · Data Analysis
  • Cleaned and explored campaign data
  • Created derived features for better segmentation
  • Visualized conversion rates and customer behavior
  • Extracted business-oriented conclusions from the analysis
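
The kind of segmentation described above can be sketched with pandas as follows. This is a minimal example under stated assumptions: the `age_band` feature, the bin edges, the `conversion_by_segment` helper and the eight sample rows are all hypothetical and do not come from the project's dataset.

```python
import pandas as pd


def conversion_by_segment(df, segment_col, target_col="subscribed"):
    """Return contacts, conversions and conversion rate per segment."""
    summary = (
        df.groupby(segment_col, observed=True)[target_col]
        .agg(contacts="count", conversions="sum")
        .reset_index()
    )
    summary["rate"] = summary["conversions"] / summary["contacts"]
    return summary.sort_values("rate", ascending=False).reset_index(drop=True)


# Made-up sample, only to show the mechanics.
campaign = pd.DataFrame({
    "age": [23, 35, 47, 52, 61, 29, 44, 68],
    "subscribed": [True, False, False, True, True, False, False, True],
})

# Derived feature: bucket a numeric column into segments (bin edges are an assumption).
campaign["age_band"] = pd.cut(
    campaign["age"], bins=[0, 30, 50, 120], labels=["<=30", "31-50", "51+"]
)

result = conversion_by_segment(campaign, "age_band")
print(result)
```

The resulting table can be passed directly to a bar chart in Matplotlib or Seaborn to compare conversion rates across segments.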

Clinical Data Preprocessing

Final Degree Project focused on studying the importance of data preprocessing in clinical prediction problems, comparing machine learning models before and after applying preprocessing techniques.

Python · Scikit-learn · Machine Learning · Preprocessing · Medical Data
  • Worked with several clinical datasets
  • Compared models with and without preprocessing
  • Applied techniques such as imputation, scaling, balancing and feature selection
  • Evaluated models using Accuracy, Precision, Recall and F1-score
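
A before-and-after comparison of this kind can be sketched with scikit-learn as shown here. This is not the project's code: it assumes synthetic data from `make_classification` instead of clinical datasets, logistic regression as the only model, and class weighting as a stand-in for balancing. The baseline still fills missing values with a constant, because the model cannot be fitted on NaN values at all.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced data (an assumption; not a clinical dataset).
X, y = make_classification(n_samples=600, n_features=10,
                           weights=[0.8, 0.2], random_state=0)
rng = np.random.default_rng(0)
X = X * rng.uniform(1, 100, size=X.shape[1])  # put features on different scales
X[rng.random(X.shape) < 0.1] = np.nan         # inject about 10% missing values

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

pipelines = {
    # Minimum needed for the model to run: constant fill, nothing else.
    "baseline": make_pipeline(
        SimpleImputer(strategy="constant", fill_value=0.0),
        LogisticRegression(max_iter=1000),
    ),
    # Median imputation, scaling and class weighting.
    "preprocessed": make_pipeline(
        SimpleImputer(strategy="median"),
        StandardScaler(),
        LogisticRegression(max_iter=1000, class_weight="balanced"),
    ),
}


def evaluate(model):
    """Fit on the training split and score on the held-out split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
    }


results = {name: evaluate(pipe) for name, pipe in pipelines.items()}
for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Wrapping each variant in a pipeline keeps the imputer and scaler fitted on the training split only, which avoids leaking test information into the comparison.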

CSV Analytics Web App

Web application that allows users to upload CSV files and receive an initial automatic analysis, including dataset summary, detected issues, visualizations and basic insights.

Python · Streamlit · Pandas · Data Visualization
  • CSV upload and automatic data preview
  • Basic dataset profiling
  • Detection of missing values and potential issues
  • Initial visualizations and insights
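
The profiling step could look roughly like the sketch below. The `profile_csv` helper, the checks it runs and the sample CSV are hypothetical and only illustrate the idea; the Streamlit interface is left out. In an app, the file-like object returned by Streamlit's `st.file_uploader` could be passed in place of the in-memory sample, since `pandas.read_csv` accepts file-like objects.

```python
import io

import pandas as pd


def profile_csv(source):
    """Read a CSV (path or file-like object) and return a basic profile."""
    df = pd.read_csv(source)
    missing = df.isna().sum()
    return {
        "rows": len(df),
        "columns": list(df.columns),
        "dtypes": df.dtypes.astype(str).to_dict(),
        # Only columns that actually have missing values.
        "missing_by_column": missing[missing > 0].to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
        # Columns with at most one distinct non-missing value carry no signal.
        "constant_columns": [
            col for col in df.columns if df[col].nunique(dropna=True) <= 1
        ],
    }


# Made-up sample with a missing city, a missing amount and one duplicated row.
sample = io.StringIO(
    "id,city,amount,currency\n"
    "1,Madrid,10.5,EUR\n"
    "2,,7.0,EUR\n"
    "2,,7.0,EUR\n"
    "3,Sevilla,,EUR\n"
)
report = profile_csv(sample)
print(report)
```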

Skills

Tools and techniques

Programming & Data

Python · Pandas · NumPy · Scikit-learn · Matplotlib · Seaborn

Databases

SQL · MySQL · PostgreSQL · Data modeling · Dimensional modeling

Machine Learning

Classification models · Preprocessing · Feature engineering · Model evaluation · Imbalanced data handling

Tools

Git · GitHub · Jupyter Notebook · VS Code · MySQL Workbench · Excel · Power BI

Contact

Interested in my profile?

You can explore my projects or contact me through LinkedIn or email.