As a Data Analyst with a background in geography and quantitative science, I have experience in precision marketing campaigns, member profile analysis, and large-scale data projects. I have developed Power BI dashboards and conducted internal data training. I enjoy tasks and projects that require teamwork, and I am always excited when new ideas pop up during brainstorming. With a passion for uncovering new information in large volumes of data, I am adept at conducting member life cycle analysis and using member profiles and consumption behavior to inform operational management suggestions.
I enjoy watching TV series, listening to music, playing the piano, and learning Japanese in my leisure time. I also like to play volleyball on weekends with friends.
It’s really nice to meet you. You can also reach me through the contact information below.
#GIS  #Statistical Modeling  #Spatial Data Analysis  #Data Visualization  #Cartography & Mapping  #Web Scraping  #Machine Learning
Skills
Programming Language
Python
Data analysis and visualization: e.g. PySpark, Pandas, Matplotlib, Plotly
Machine learning: e.g. Scikit-Learn, NumPy, SciPy
Web scraping: e.g. requests, BeautifulSoup, selenium
SQL
Data cleaning and processing
R
Data cleaning and processing: e.g. dplyr, plyr, tidyr, reshape2
Spatial data handling and analysis: e.g. sf, sp, spdep, rgdal, raster
(Spatial) data visualization: e.g. ggplot2, tmap, GISTools, leaflet, plotly
Statistical and machine-learning models: e.g. nlme, lme4, mgcv, gamm4, multilevel, car
Interactive dashboard: e.g. shiny, shinydashboard
HTML and CSS (front-end web design)
Software & Libraries
Business Intelligence and Dashboards: Power BI, DAX
Distributed Computing: Spark, PySpark
Geo-related: ArcGIS Desktop and Pro, QGIS, ENVI, GeoDa
Statistics-related: SPSS, Stata, NLOGIT
Others: Gephi, NetLogo, STELLA, Microsoft Office, Git
Implemented precision marketing campaigns using data analysis:
Proposed the company's first spatial analysis-based precision marketing project, sending personalized notifications to members based on their location, achieving a 25x ROI.
Improved the efficiency of EDM campaigns by optimizing the contact list selection mechanism, which lowered the number of invalid contacts and reduced the sending cost by 40%.
Analyzed and utilized member profiles and consumption behavior:
Conducted member life cycle analysis to identify reasons for member churn and devised strategies to re-engage 7% of dormant members.
Led large-scale analysis projects such as annual reviews and the analysis of one million iRent members, providing operational management suggestions for site setup, vehicle scheduling, and cross-industry cooperation.
Developed Power BI dashboards and conducted internal data training:
Established marketing effectiveness statistical dashboards, simplifying manual processes and reducing the required time by 70%.
Conducted 4 Power BI education and training sessions, covering data logic, software operation methods, data updating time, and DAX syntax writing.
Research Assistant
Department of Political Science, National Taiwan University
2019/9 - 2021/8
Conducted research projects that involved building statistical and machine-learning models (logistic regression, multilevel models, non-linear regression, clustering, network analysis) in combination with geographical information systems (GIS) and spatial data analysis.
Publications: 2 journal articles submitted to top international academic journals (under review).
Worked on the interim and annual reports for the Ministry of Science and Technology (科技部), and was highly praised by the principal investigator.
Presented academic research results in English at a large international conference (Annual Meeting of the American Association of Geographers), and received positive feedback from the advisor and the audience.
Department of Geography, National Taiwan University
2017/9 - 2018/6
Designed and taught weekly lab classes on statistical models and tests and on spatial data analysis and visualization, using R and ArcGIS Desktop.
Achieved a score of 4.91/5 in the TA evaluation and received the Outstanding Teaching Assistant Award in each semester.
Head organizer of academic conference
Chinese Cartographic Association
2018/6 - 2018/10
Organized a 150-person cartographic conference as the leader of the set coordinators. Achieved a 50% increase in participants and a 14% increase in sponsorship compared with the previous year.
Responsibilities included planning and arranging the venue, designing the visitor flow for the manufacturers' achievement exhibition, preparing the equipment for the live online broadcast, and training the receptionists for the day of the event.
Research Assistant
Research Center for Environmental Changes, Academia Sinica
2015/3 - 2018/12
By applying a systematic approach, digitized a wealth of historical weather records quoted in a compendium of Chinese meteorological records of the last 3,000 years and established a spatio-temporal Chinese historical climate database.
Participated in the whole database creation process from the beginning, and gained an understanding of the problems that can arise while building a database.
Honours & Awards
Master's thesis scholarship for Intelligent Transportation Systems (ITS) from Far Eastern Electronic Toll Collection 2017/10
Dean's Award of College of Science, National Taiwan University
(理學院院長獎, Graduation Rank top 10%)
2015/6
Presidential Awards
(書卷獎, GPA over 3.38 and Rank top 20%)
2015 spring semester (GPA 4.02/4.3 & Rank #2)
2014 fall semester (GPA 4.15/4.3 & Rank #3)
Outstanding Teaching Assistant Award of the Department
Cartography and GIS 2018 spring semester
Spatial Analysis: Methods and Applications 2017 fall semester
Education
Master of Geography (MSc)
College of Science, National Taiwan University
2016/9 - 2019/8
Thesis: Characterizing Urban Traffic Congestion Propagation Process in Different Built Environments: Using Multilevel Growth Modeling. Link to thesis
Received the master's thesis scholarship for Intelligent Transportation Systems (ITS) from Far Eastern Electronic Toll Collection.
Highest GPA: 4.3/4.3, Ranking: 1/36.
Bachelor of Geography
College of Science, National Taiwan University
2012/9 - 2016/6
Achieved A and A+ in all courses in Statistical Modeling, Spatial Analysis and Epidemiology.
Obtained Dean's Award of College of Science (理學院院長獎, Graduation Rank top 10%)
Gained Presidential Awards × 2 (書卷獎, GPA over 3.38 and Rank top 20%)
Highest GPA: 4.15/4.3, Ranking: 2/46.
Served as the leader of the department volleyball team and designed the training program. Won two championships and one silver medal in national contests.
Senior High School
Taipei First Girls' High School
2009/9 - 2012/6
Joined Scouting and participated in public services, e.g. maintaining traffic flow and giving directions at the New Year national flag-raising ceremony in the presidential square and at the 2010 Taipei International Flora Exposition.
Attended the National Senior High School Jamboree in Pingtung and the Central European Jamboree in Budapest, Hungary in 2010.
Data-Driven Insights for iRent: 2022 Customer Behavior and Operational Strategies
The analysis of iRent user behavior enabled customer segmentation based on rental patterns, such as early booking for specific holidays, long-term rentals, and last-minute rentals. Additionally, vehicle usage patterns were analyzed across different time periods and regions to identify gaps between reservation orders that can cause difficulties in long-term rentals. Based on these findings, operational management recommendations were provided. Furthermore, the study identified special behaviors of specific populations during certain times, such as popular tourist destinations for different age groups during the New Year. This information was utilized for cross-industry collaboration suggestions and press releases to enhance the topic's appeal and intrigue.
Precision Marketing Strategy: Enhancing Promotion Efficiency through Location-based Notifications
To maximize promotion efficiency and increase awareness and usage of the new site, we implemented a precision marketing approach. First, we analyzed the locations of customers and vehicles during reservations and determined an acceptable travel distance. Based on this information, we recommended the nearest station to customers within that distance and sent personalized push notifications to them based on their location. By recommending the nearest station and using a precision marketing approach, we were able to achieve a 90% reduction in marketing costs and a cost-benefit ratio of up to 25 times. This approach allowed us to engage more effectively with our customers and deliver a more targeted and personalized experience.
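The matching step described above can be sketched roughly as follows. This is a minimal illustration, not the production pipeline: the station and member coordinates, the `MAX_KM` threshold, and the function names are all hypothetical, and great-circle (haversine) distance is assumed as a stand-in for whatever travel-distance measure was actually used.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_station(customer, stations, max_km):
    """Return (station_name, distance_km) for the closest station within max_km, else None."""
    best = None
    for name, (lat, lon) in stations.items():
        d = haversine_km(customer[0], customer[1], lat, lon)
        if d <= max_km and (best is None or d < best[1]):
            best = (name, d)
    return best

# Hypothetical data: station and member locations as (latitude, longitude).
stations = {"station_a": (25.0478, 121.5170), "station_b": (25.0330, 121.5654)}
customers = {"member_1": (25.0460, 121.5200), "member_2": (24.9000, 121.3000)}
MAX_KM = 1.0  # hypothetical "acceptable travel distance"

# Members mapped to None are outside the acceptable distance and receive no notification.
targets = {m: nearest_station(loc, stations, MAX_KM) for m, loc in customers.items()}
print(targets)
```

Only members with a station inside the threshold are targeted, which is what keeps the notification volume, and therefore the cost, low.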
Leveraging App Usage Data for Effective Car Rental Site Selection and Demand Forecasting
The project involved proactively collecting online data such as user behavior, time, and location within the app. By comparing user actions at different time points, including booking, pick-up, and drop-off, user behavior was segmented into categories based on intent, such as reserving a vehicle, picking up a vehicle, returning a vehicle, or attempting to rent a vehicle but failing due to a lack of availability. By analyzing the supply and demand of vehicles and the actual rental conversion rates in different areas, suggestions for site setup were presented to the operations department in detailed monthly reports.
Keywords:
supply and demand, user behavior, location-allocation analysis
Skills:
SQL, R, QGIS
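As a rough illustration of the conversion-rate idea in this project, the sketch below aggregates intent-labelled events by area. The original work was done in SQL, R, and QGIS; this Python version, the event labels, the area names, and the definition of conversion rate (successful reservations divided by all rental attempts) are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical event log: (area, intent) pairs derived from in-app behavior.
events = [
    ("area_1", "reserve"), ("area_1", "reserve"), ("area_1", "failed_no_vehicle"),
    ("area_2", "reserve"), ("area_2", "failed_no_vehicle"),
    ("area_2", "failed_no_vehicle"), ("area_2", "failed_no_vehicle"),
]

def conversion_by_area(events):
    """Share of rental attempts in each area that ended in a reservation."""
    counts = defaultdict(lambda: {"reserve": 0, "failed_no_vehicle": 0})
    for area, intent in events:
        if intent in counts[area]:  # ignore intents that are not rental attempts
            counts[area][intent] += 1
    rates = {}
    for area, c in counts.items():
        attempts = c["reserve"] + c["failed_no_vehicle"]
        rates[area] = c["reserve"] / attempts if attempts else None
    return rates

rates = conversion_by_area(events)
print(rates)
```

Areas with many failed attempts and a low conversion rate indicate unmet demand, which is the kind of signal a site-setup suggestion could be based on.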
Spatial Decision Support App for Rental House Prices
This project explores the spatial patterns of rental house prices and reveals the factors that determine the prices. The spatial analysis in this study was all done in R, and the results were wrapped up in an interactive app built with R Shiny.
In Taiwan, the 591 house rental website is one of the most commonly used platforms for people to look for house rentals. When searching rental homes, a convenient location with great accessibility to public transport is often a major concern, especially in cities.
The 591 house rental website provides the distance between each listed house and the closest MRT station. Besides MRT services, however, bus and bike-sharing systems are also common forms of public transportation in the Taipei metropolitan area, which includes Taipei and New Taipei City. Moreover, each house searcher may have their own preferences for public transport. While some people find the bus system convenient for getting anywhere in the city, others value travel time more and prefer the MRT to bus services since it is less likely to be affected by traffic jams.
Therefore, we designed an innovative house rental recommendation system and built a prototype with R Shiny. The system lets users not only filter available houses by district, price, and area, but also input their personal preferences for accessibility to different kinds of public transport. It lists customized search results in order and shows the location of each recommended house on an interactive map.
*This system was a team project with two of my graduate school classmates, 廖晧宇 and 謝澤星.
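The preference-based ranking can be illustrated with a small sketch. The prototype itself was built in R Shiny; the Python below, the listing records, the weight values, and the scoring rule (a weighted average of distances to each transport mode, lower is better) are assumptions for illustration only and may differ from the scoring the app actually used.

```python
# Hypothetical listings: distances (in metres) to the nearest stop of each transport mode.
listings = [
    {"id": "h1", "district": "Daan", "price": 18000,
     "dist_m": {"mrt": 300, "bus": 500, "bike": 200}},
    {"id": "h2", "district": "Daan", "price": 15000,
     "dist_m": {"mrt": 900, "bus": 100, "bike": 400}},
    {"id": "h3", "district": "Daan", "price": 30000,
     "dist_m": {"mrt": 100, "bus": 100, "bike": 100}},
    {"id": "h4", "district": "Banqiao", "price": 12000,
     "dist_m": {"mrt": 200, "bus": 200, "bike": 200}},
]

def recommend(listings, district, max_price, weights):
    """Filter by district and price, then rank by weighted distance (smaller is better)."""
    total = sum(weights.values())

    def score(item):
        return sum(weights[m] * item["dist_m"][m] for m in weights) / total

    candidates = [x for x in listings
                  if x["district"] == district and x["price"] <= max_price]
    return sorted(candidates, key=score)

# Two hypothetical users with different transport preferences.
mrt_first = recommend(listings, "Daan", 20000, {"mrt": 0.7, "bus": 0.2, "bike": 0.1})
bus_first = recommend(listings, "Daan", 20000, {"mrt": 0.1, "bus": 0.8, "bike": 0.1})
print([x["id"] for x in mrt_first], [x["id"] for x in bus_first])
```

The same filtered candidates come back in a different order for each user, which is the point of letting users state their own accessibility preferences.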
Elucidating How the Red Imported Fire Ant (Solenopsis invicta) Diffused Spatiotemporally Among Different Landscapes in North Taiwan, 2008-2015
Abstract
Solenopsis invicta Buren, also known as the red imported fire ant (RIFA), has had a large negative impact on human and livestock health. However, few studies have further investigated the influence of human land-use, which is an important factor affecting the habitats of insects, on the expansion of RIFAs. In addition, there is a lack of knowledge of the empirical associations between RIFA diffusion and land-use within countries. Therefore, the objectives of this study were to provide an approach to delineate the areas of RIFA infestations and explore how land-use influences the spatiotemporal diffusion of S. invicta.
We used RIFA data from 2008 to 2015 from the RIFA surveillance system, which was conducted by the National RIFA Control Center in Taiwan. Two regions in Taiwan with different RIFA infestation levels were investigated. The ordinary kriging method was applied to show the spatial intensity of RIFAs, the extreme distance estimator method was applied to determine the critical diffusion distance of RIFAs, network analyses were used to identify RIFA invasion routes between land-use types, and bivariate local indicators of spatial association were used to capture the invasion process in time and space.
The results showed, on average, that the RIFA dispersal distance ranged from 600 to 650 m in two consecutive years in both high- and low-infestation regions. In addition, the main roads were identified as bridges that linked RIFA dispersal between rural and urban areas. Therefore, it is suggested that RIFA control activities be implemented at least 600 m from the observed spot. Additionally, control activities should be conducted on main roads linking different land-use types. Finally, restrictions should be placed on the movement of plants and soil between areas to prevent the accidental spread of RIFAs.
Modeling geographical invasions of Solenopsis invicta influenced by land-use patterns
Abstract
Research into the geographical invasion of the red imported fire ant (RIFA) driven by anthropogenic disturbances has received considerable attention. Among anthropogenic disturbances, however, little is known about how land-use change and the characteristics of roads in different land-use types are associated with the risk of successful RIFA invasion. Furthermore, previous research has often assumed that the risk of successful RIFA invasion has a linear association with the independent variables. A linear relationship, however, may not reflect actual circumstances.
RIFA data for Kinmen from 2016 to 2019 from the Taiwan RIFA surveillance system were used in this study. By applying linear and non-linear approaches, this study assessed how land-use types, the distance from the nearest road to different land-use types, and spatial factors affected the risk of successful RIFA invasion. The results showed that agricultural land, land for transportation usage, and land that changed use from 2014 to 2017 had greater odds of successful RIFA invasion than natural land. This study also found that, on land for transportation usage and in areas of land-use change from 2014 to 2017, more than 60% of successful RIFA invasions occurred within 350 m and 150 m of the nearest roads, respectively.
These results suggest that control activities need to focus on areas of land-use change between years, especially areas within 150 m of the nearest roads. Moreover, control activities need to emphasize land for transportation usage, especially areas within 350 m of the nearest roads. This study provides important insights into RIFA invasions on an isolated island and into the areas where control strategies should be implemented.
Visualizing the population mobility change under the local outbreak in Taipei using Facebook data
The Central Epidemic Command Center (CECC) announced a nationwide Level 3 alert for COVID-19 on May 19, 2021. In response to the COVID-19 pandemic, many companies decided to let employees work from home or take turns coming to the office to reduce crowding. Using about 6.8 million records of user data from Facebook and land-use data in Taiwan, this project analyzed and visualized the population mobility change under the local outbreak in Taipei from May to June. The results showed that total mobility did not change much after the Level 3 alert. However, there was a clear difference between long- and short-distance movements: long-distance movements decreased while short-distance movements increased after the Level 3 alert, especially in residential districts.
Data cleaning, processing, and visualization were all done in R.
Keywords:
GIS, Spatial Data, Spatial Analysis, Data Visualizing, Mobility, COVID-19
Characterizing Urban Traffic Congestion Propagation Process in Different Built Environments: Using Multilevel Growth Modeling
Abstract
Traffic congestion propagation is a major issue in road transportation systems. It covers not only the congestion condition at a given time but also the formation, transmission, propagation speed, affected areas, and other factors of congestion. Understanding these characteristics can help people manage and reduce traffic congestion. Numerous studies have examined how the built environment affects traffic congestion, but most focused on congestion in a specific time period, such as weekdays, weekends, off-peak hours, or peak hours. However, it is not clear how the built environment may affect congestion propagation at different times. Limited by computing efficiency, the few studies on congestion propagation have mainly been based on microsimulations of link-level dynamics, and research on large urban networks is lacking.
This study proposed a new empirical method to analyze traffic congestion propagation. Using a quadratic growth model, the relationship between congestion propagation and the built environment was analyzed. A data set of vehicle detector (VD) data for 2017/1/10 (Tue.) and 1/21 (Sat.), 12:00-19:00, from the Bureau of Traffic Engineering, Taipei City Government, was used to assess traffic conditions. A multilevel growth model was utilized to analyze the spatial-temporal traffic data. To deal with the dependency among the data, the VDs were first divided into clusters by Max-P clustering. Then the ratio of congested VDs within each cluster was used as the dependent variable, which represented the level of congestion. Independent variables included a time variable and the built environment factors within a 500-meter radius of each cluster, which could be divided into traffic-related factors and land use. The results used a function of time and figures to show the relationship between congestion and the built environment factors over time.
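The dependent variable described above, the share of congested VDs in each cluster at each time step, can be computed along these lines. This sketch covers only that preprocessing step, not the Max-P clustering or the multilevel growth model; the detector IDs, cluster labels, speed readings, and the speed threshold used to call a detector "congested" are hypothetical, since the abstract does not state how congestion was defined.

```python
from collections import defaultdict

CONGESTION_SPEED_KMH = 20.0  # hypothetical threshold for a "congested" detector

# Hypothetical cluster assignment (e.g. the output of a regionalization step).
vd_cluster = {"vd1": "c1", "vd2": "c1", "vd3": "c2", "vd4": "c2"}

# Hypothetical speed readings (km/h) per time step and detector.
speeds = {
    "12:00": {"vd1": 45.0, "vd2": 15.0, "vd3": 50.0, "vd4": 40.0},
    "12:05": {"vd1": 18.0, "vd2": 12.0, "vd3": 35.0, "vd4": 19.0},
}

def congested_ratio(vd_cluster, speeds, threshold):
    """Return {(cluster, time): share of detectors below the speed threshold}."""
    ratios = {}
    for t, readings in speeds.items():
        total = defaultdict(int)
        congested = defaultdict(int)
        for vd, speed in readings.items():
            c = vd_cluster[vd]
            total[c] += 1
            if speed < threshold:
                congested[c] += 1
        for c in total:
            ratios[(c, t)] = congested[c] / total[c]
    return ratios

ratios = congested_ratio(vd_cluster, speeds, CONGESTION_SPEED_KMH)
print(ratios)
```

Each `(cluster, time)` ratio would then be one observation in the growth model, with time and its square as predictors alongside the built environment factors.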
Revealing the underlying consumer behaviors using daily vegetable wholesale prices and trading volumes
This project explored the patterns of daily vegetable wholesale prices and revealed the factors that determine the prices and the underlying consumer behaviors. 2.5 million records of daily vegetable wholesale prices and trading volumes from the Council of Agriculture for 2011-2017 were collected using web scraping. Historical climate data, including daily temperature, precipitation, and typhoon records, were also used in the analysis.
In this study, we tried to answer the following questions:
Do vegetable wholesale prices and trading volumes differ between farm gates and other places?
Is it reasonable to bulk-buy vegetables before a typhoon?
Web scraping, data cleaning, processing, and visualization were all done in R.
*This project was a team project with two of my graduate school classmates, 洪嘉鴻 and 王崧阡.
Comparing the term frequency of the speeches of President Tsai in 2016 and 2017 by text mining
This project analyzed the term frequency of the speeches of President Tsai (in Chinese) in 2016 and 2017 with text mining and natural language processing. The raw text of the speeches was first tokenized into small chunks using the R package “jiebaR”. Then stop words, the most common words like “a” or “the”, were removed to ensure the result would be meaningful. Bar charts and word clouds were used to compare the most frequently used words in different years. The speech of President Tsai in 2016 mainly emphasized the new government’s taking office and the ambitions and expectations for the future. She used words such as "we", "country", "society" and "Taiwan" frequently to emphasize the unity of the people. On the other hand, the wording used in 2017 was less positive than in 2016. Many more words were used to review the governance of the past year, such as "not enough", "improper", and "difficult".
Data cleaning, processing, and visualization were all done in R.
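The tokenize, filter, and count steps can be sketched as below. This is an illustration in Python rather than the R workflow used in the project: whitespace splitting stands in for jiebaR's Chinese word segmentation, the stop-word list is a tiny illustrative set, and the two sample texts are made-up placeholders, not quotations from any speech.

```python
from collections import Counter

STOP_WORDS = {"a", "the", "of", "and", "to"}  # tiny illustrative stop-word list

def term_frequency(text, stop_words):
    """Lowercase, split on whitespace, strip punctuation, drop stop words, and count."""
    tokens = [w.strip('.,;:!?"').lower() for w in text.split()]
    return Counter(t for t in tokens if t and t not in stop_words)

# Placeholder texts invented for this example (not real speech content).
text_a = "We love the country and we trust the society of Taiwan."
text_b = "The reform was difficult and the progress was not enough."

tf_a = term_frequency(text_a, STOP_WORDS)
tf_b = term_frequency(text_b, STOP_WORDS)

# The most frequent terms of each text are what a bar chart or word cloud would display.
print(tf_a.most_common(3))
print(tf_b.most_common(3))
```

Comparing the two `Counter` objects side by side is the tabular equivalent of the year-over-year bar charts described above.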
Web scraping can help people extract an enormous amount of data from websites automatically. This project scraped different websites using Python. For some sources, APIs were used to access the data directly without parsing HTML, e.g., the 7-day weather forecasts for 368 townships. Static websites, e.g., Jin Yong novels and LINE stickers, were scraped with the packages “requests” and “BeautifulSoup”, since the HTML response could be parsed directly. For dynamic websites such as 104 Job Bank, YouTube, Facebook, Instagram, and the interpretations of the Constitutional Court, Selenium was used to automate web browsers in order to execute the JavaScript code received from servers.
Keywords:
web scraping, web crawling, API, regular expression, CSS selector
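For the static-site case, the parsing step looks roughly like the sketch below. To keep it runnable offline and without third-party packages, it parses an inline sample string with the standard library's `html.parser` instead of fetching a page with requests and parsing it with BeautifulSoup; the sample markup, paths, and class name `LinkCollector` are hypothetical.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect (link text, href) pairs from every <a href="..."> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:  # only keep text that sits inside a link
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None

# Hypothetical static HTML, standing in for a downloaded chapter index page.
SAMPLE_HTML = """
<ul class="chapters">
  <li><a href="/novel/ch1.html">Chapter 1</a></li>
  <li><a href="/novel/ch2.html">Chapter 2</a></li>
</ul>
"""

parser = LinkCollector()
parser.feed(SAMPLE_HTML)
links = parser.links
print(links)
```

With a real static site, the string passed to `feed` would be the body of an HTTP response; dynamic pages need a browser automation tool first because their content is only present after JavaScript runs.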