shreyamalogi/README.md

👋 Hello! I'm SHREYA MALOGI

MSc Data Analytics @ BSBI Berlin | Data Scientist | ML Engineer
Architecting Scalable Forecasting Engines & Industrial Computer Vision


🚀 Career Objective & Immediate Availability

🎯 Actively Interviewing: Seeking Full-Time Data Science / ML Engineer roles.

📍 Location: Berlin, Germany (Hybrid/Remote) & Open to GCC Roles in Hyderabad, India.

📅 Availability: Immediate Start (April 2026).

🇩🇪 Language: German (A2 Elementary – Advancing to B1).


📊 Engineering Highlights

  • 🏗️ Big Data Architect: Engineered a 15.2M-record pipeline using PySpark and GCP Dataproc.
  • 📉 Memory Optimization: Achieved a 70% RAM reduction via data downcasting and feature engineering.
  • 👁️ Computer Vision: Developed an industrial-grade ReUNet model achieving 91.7% accuracy.
  • 👥 Technical Leadership: Former Founder & Lead Developer at CodeMacrocosm; scaled an open-source community to 2,100+ members and 1.2k+ Stars.
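
The downcasting idea mentioned above can be illustrated with a minimal sketch. This is not the pipeline's actual code: it uses only the standard-library `array` module, and the function names and sample values are hypothetical. In pandas, a comparable effect is usually obtained with `pd.to_numeric(..., downcast="integer")`.

```python
from array import array

# Signed integer typecodes, ordered from narrowest to widest.
_SIGNED_TYPECODES = ("b", "h", "i", "q")


def smallest_typecode(values):
    """Return the narrowest signed typecode that can hold every value."""
    lo, hi = min(values), max(values)
    for code in _SIGNED_TYPECODES:
        bits = array(code).itemsize * 8
        if -(2 ** (bits - 1)) <= lo and hi <= 2 ** (bits - 1) - 1:
            return code
    raise OverflowError("values do not fit in a 64-bit signed integer")


def downcast(values):
    """Pack integers into the narrowest array type that fits them."""
    return array(smallest_typecode(values), values)


def payload_bytes(arr):
    """Size of the raw element buffer, excluding object overhead."""
    return arr.itemsize * len(arr)


# Hypothetical daily unit sales: small counts stored as 64-bit integers.
units_sold = [0, 3, 0, 12, 7, 0, 1, 45]
wide = array("q", units_sold)
narrow = downcast(units_sold)
print(payload_bytes(wide), "->", payload_bytes(narrow), "bytes")
```

The saving depends entirely on the value ranges in each column, so the 70% figure quoted above should be read as a result for that dataset, not a general guarantee.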

🛠️ Technical Toolbox (Ranked by Industry Demand)

| Category | Tools & Technologies |
| --- | --- |
| Machine Learning | Python, Scikit-Learn, LightGBM, XGBoost, TensorFlow, OpenCV |
| Data Engineering | PySpark, GCP (Dataproc/BigQuery), SQL (PostgreSQL), ETL Pipelines |
| Statistical Research | Predictive Modeling, Time-Series Forecasting, Sales Analytics |
| Software Ops | Git/GitHub, DSA (O(n) Optimization), Docker, Flask API |

🏆 Featured Project: Industrial-Scale Forecasting Pipeline

Scale: 15.2 Million Transactions | Tech: PySpark, LightGBM, GCP

  • The Problem: High latency and out-of-memory crashes during large-scale retail demand forecasting.
  • The Solution: Implemented a Tweedie-loss LightGBM model with a memory-optimized data loader.
  • The Result: 70% less memory usage and 15% higher accuracy than baseline models. View Project →
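
To show why a Tweedie objective suits demand data with many zero-sales days, here is a small sketch of the Tweedie unit deviance for a variance power between 1 and 2. It is an illustration only, not the project's code: the function names and the default power of 1.5 are assumptions. In LightGBM itself the objective is selected with `objective="tweedie"` and tuned with `tweedie_variance_power`; the library optimizes a related likelihood-based loss internally.

```python
def tweedie_unit_deviance(y, mu, power=1.5):
    """Tweedie unit deviance for 1 < power < 2 (compound Poisson-gamma).

    y  : observed value, must be >= 0 (exact zeros are allowed)
    mu : predicted mean, must be > 0
    """
    if not 1 < power < 2:
        raise ValueError("this sketch only covers 1 < power < 2")
    if y < 0 or mu <= 0:
        raise ValueError("need y >= 0 and mu > 0")
    a = y ** (2 - power) / ((1 - power) * (2 - power))
    b = y * mu ** (1 - power) / (1 - power)
    c = mu ** (2 - power) / (2 - power)
    return 2 * (a - b + c)


def mean_tweedie_deviance(y_true, y_pred, power=1.5):
    """Average unit deviance over paired observations and predictions."""
    pairs = list(zip(y_true, y_pred))
    return sum(tweedie_unit_deviance(y, mu, power) for y, mu in pairs) / len(pairs)


# Hypothetical sales with many zeros and strictly positive predictions.
sales = [0, 0, 3, 0, 8]
predicted = [0.4, 0.2, 2.5, 0.1, 6.0]
print(round(mean_tweedie_deviance(sales, predicted), 4))
```

Unlike a log-transformed squared error, this deviance is defined at exactly zero, which is the property that makes it a natural fit for zero-inflated demand.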

🧠 My DSA & Coding Philosophy

I write production-grade Python. I focus on $O(n)$ time complexity and memory-efficient data structures so that ML pipelines scale from 1k to 10M+ records.
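
As a small illustration of that approach, the sketch below aggregates quantities per key in a single pass over a lazily parsed stream, so time grows linearly with the number of records and memory grows only with the number of distinct keys. The record format and function names are hypothetical.

```python
from collections import defaultdict


def read_records(lines):
    """Lazily parse 'key,qty' lines so the full dataset is never held in memory."""
    for line in lines:
        key, qty = line.strip().split(",")
        yield key, int(qty)


def totals_by_key(records):
    """Sum quantities per key in one pass over any iterable of (key, qty)."""
    totals = defaultdict(int)
    for key, qty in records:
        totals[key] += qty
    return dict(totals)


# Hypothetical input; in practice `lines` could be an open file object.
sample = ["store_a,3", "store_b,5", "store_a,4"]
print(totals_by_key(read_records(sample)))
```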


📫 Let's Connect

LinkedIn Email

Other Tools:

aws bootstrap c cplusplus css3 dart django docker express figma firebase flask flutter git graphql heroku html5 java javascript kotlin mongodb mysql nextjs nodejs opencv php postman python react redux spring sqlite tailwind typescript


+@ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @+
@@       o o                                           @@
@@       | |                                           @@
@@      _L_L_                                          @@
@@   ❮\/__-__\/❯ Programming isn't about what you know @@
@@   ❮(|~o.o~|)❯  It's about what you can figure out   @@
@@   ❮/ \`-'/ \❯                                       @@
@@     _/`U'\_                                         @@
@@    ( .   . )     .----------------------------.     @@
@@   / /     \ \    | while( ! (succeed=try()) ) |     @@
@@   \ |  ,  | /    '----------------------------'     @@
@@    \|=====|/                                        @@
@@     |_.^._|                                         @@
@@     | |"| |                                         @@
@@     ( ) ( )   Testing leads to failure              @@
@@     |_| |_|   and failure leads to understanding    @@
@@ _.-' _j L_ '-._                                     @@
@@(___.'     '.___)                                    @@
+@ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @+

Pinned

  1. Industrial-Demand-Forecasting-Pipeline Public

    Architected a high-performance predictive pipeline processing 15 Million transactions. Optimized memory by 70% via custom downcasting and implemented Tweedie-LightGBM to solve zero-inflation in ret…

    Jupyter Notebook 14

  2. Amazon-BigData-Verified-Review-Classifier Public

    Scalable Trust-Signal Detection: A Big Data pipeline using PySpark and GCP Dataproc to classify 8GB+ of Amazon reviews with high-precision Random Forest modeling. Engineered for horizontal scalabil…

    Python 3

  3. Multi-Domain-CV-Intelligence-Workspace Public

    A high-precision Computer Vision workspace featuring ReUNet for medical segmentation and MobileNetV2 for AgTech classification. Demonstrating cross-domain AI expertise in Healthcare Diagnostics and…

    Jupyter Notebook 1

  4. Retail-Data-Engineering-Pipeline Public

    Scalable ETL Pipeline: Processing 5M+ retail records with PySpark on GCP Dataproc. Automated the extraction of global business KPIs and consumer trends. Includes an Ethical Data Framework to ensure…

    Python 14

  5. Biometric-Attendance-Engine Public

    Real-time face recognition system using HOG encodings and Dlib landmarks. Features a high-speed Flask/OpenCV pipeline for live video processing and automated SQL database logging.

    HTML 16 1

  6. Intelligent-Travel-Recommendation-Engine Public

    An Intelligent Travel Recommendation Engine using TF-IDF Vectorization and KNN to predict optimal tourist destinations. Features a modular Python/Tkinter architecture and mathematical similarity sc…

    Python 8
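
The TF-IDF plus nearest-neighbour approach named in the travel recommendation project can be sketched in plain Python as follows. This is a hedged illustration, not the repository's implementation: the destination names, their tags, and every function name here are hypothetical, and cosine similarity is assumed as the similarity score.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict term -> weight) per tokenized document."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF so terms present in every document keep a positive weight.
    idf = {t: math.log((1 + n) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in counts.items()})
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest(query_tokens, names, vectors, idf, k=2):
    """Return the k names whose vectors are most similar to the query."""
    counts = Counter(t for t in query_tokens if t in idf)
    total = sum(counts.values())
    q = {t: (c / total) * idf[t] for t, c in counts.items()} if total else {}
    ranked = sorted(zip(names, vectors), key=lambda nv: cosine(q, nv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical destinations described by a few tags each.
names = ["coastal_town", "mountain_village", "island_resort"]
docs = [
    "beach seafood nightlife".split(),
    "mountains snow trekking".split(),
    "beach snorkeling quiet".split(),
]
vectors, idf = tfidf_vectors(docs)
print(nearest("quiet beach".split(), names, vectors, idf, k=2))
```

Query terms that never appear in the corpus are ignored, so an entirely unknown query yields zero similarity to every destination.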