CV

Experience and Education

Basics

Name Mengmeng Wang
Label Data Scientist, Engineer, Researcher
Url https://mengmwang.github.io/
Summary A dedicated data scientist and researcher with a multidisciplinary background in electrical engineering, biomedical engineering, and computer science. Experienced in data cleaning and analysis, machine learning, and large language models.

Work

  • 2023.02 - Present
    Data Scientist
    Centre for Youth Mental Health, Orygen
    Worked across the end-to-end data and machine learning pipeline, including data extraction, cleaning, and statistical analysis.
    • Experienced in data extraction, wrangling, and statistical analysis using structured and unstructured data
    • Applied machine learning algorithms and large language models in data analysis and health outcomes prediction
    • Contributed to developing statistical analysis plans and writing technical reports, research papers and policy briefings
    • Collaborated with clinicians, researchers, and policy makers to deliver data-driven solutions
    • Developed interactive dashboards and visualisation tools for non-technical stakeholders to support data-informed decision-making
  • 2019.03 - 2022.11
    Data Processing & Machine Learning Tutor (Casual)
    The University of Melbourne
    Delivered tutorials and practical workshops on data processing, machine learning, and signals & systems.
    • Taught key topics including data wrangling, visualisation, natural language processing, supervised and unsupervised machine learning
    • Used Python libraries including Pandas, NumPy, and Scikit-Learn in teaching modern data science and advanced machine learning concepts

Education

Skills

Programming Languages
Python
R
SQL
MATLAB
Technical Tools
GitHub
Tableau
Power BI
Excel
Python Libraries
Pandas
NumPy
Scikit-Learn
transformers
NLTK
SciPy
Matplotlib
JSON
RegEx
Professional
Data Analysis
Technical Writing
Project Management
Teamwork
Communications

Projects

  • Clinical NLP and Predictive Modelling in Medical Case Notes
    Applied advanced natural language processing techniques and large language models (LLMs) to extract, structure, and utilise insights from unstructured clinical text. This project includes text de-identification, topic clustering and outcome prediction.
    • Project 1 - Medical case note de-identification: Developed an automated, LLM-based de-identification pipeline to identify and mask personally identifiable information (PII) in clinical notes. The solution integrates external data sources (e.g. location-based information) and goes beyond generic named entity recognition (NER) by incorporating real-world, domain-specific knowledge.
    • Project 2 - Topic Modelling and Clustering: Implemented a BERTopic-based framework to extract latent themes and group similar clinical case notes. Identified key clinical topics and themes through unsupervised clustering.
    • Project 3 - Outcome prediction: Designed and validated models using structured features and text embeddings to predict clinical outcomes.
  • EEG Data Analysis in Music Therapy
    Analysed EEG data in music therapy research, including data cleaning, preprocessing, and statistical analysis.
    • EEG data analysis and visualisation
    • Collaboration with health professionals, doctors and music therapists
  • Financial Timeseries Processing and Forecasting
    Performed financial data analysis and timeseries forecasting using machine learning models.
    • Data cleaning and preparation: outlier detection, data visualisation and feature engineering
    • Data analysis: correlation, moving-average, and auto-regression analysis
    • Timeseries forecasting: auto-regression model and machine learning models (decision tree, logistic regression, neural networks)
  • Customer Purchasing Behaviour Analysis
    Analysed transaction data to find patterns in customer behaviour.
    • Data pipeline: cleaning, preparation and visualisation
    • Drew business insights from the analysis results
  • Reinforcement Learning and Multi-Armed Bandits (MABs)
    Implemented and evaluated reinforcement learning algorithms including UCB and LinUCB.
    • ϵ-Greedy, Upper Confidence Bound (UCB) models

Volunteer

  • 2021.03 - 2022.12
    Girl Power Mentor
    The University of Melbourne
    Mentored Year 11/12 female students interested in science and engineering.
    • Encouraged young women to pursue STEM education