Total Reviews
74,896
This study analyzes Booking.com reviews of Da Nang hotels from 27 Nov 2022 to 27 Jan 2026. The pipeline maps review sentences to eight service aspects, scores sentiment at aspect level, builds review-level experience vectors, and validates cluster stability before interpretation. The final output is a ten-segment structure used to prioritize service and positioning decisions.
Review source
Booking.com guest reviews
This applied study examines heterogeneous guest experience frames in Da Nang hotels using review narratives rather than aggregated rating averages alone. The dataset contains 74,896 reviews across 458 hotels and captures the observed review stream from 27 Nov 2022 to 27 Jan 2026.
Hotels Covered
458 hotels
Retained Segments
10 segments
Average Review Score
8.42 / 10
The research gap is practical and methodological: aggregate review indicators are easy to monitor but weak for diagnosis because they compress multi-attribute experiences into single scores. The study objective is to convert narrative reviews into aspect-level experience vectors, derive stable market segments, and interpret those segments as operationally useful experience frames.
Global International Arrivals (2024)
1.4B
UN Tourism reports the market returned to roughly 99% of pre-pandemic (2019) levels.
International Arrivals (Q1 2025)
300M
Momentum remained positive at +5% year-over-year in early 2025.
Global Room-Night Demand (2024)
4.8B
JLL estimates demand rose by 102 million room nights from 2023.
Hotel Investment Volume (2024)
$57.3B
Global hotel transactions increased around 7% and liquidity improved.
Da Nang 2025 Visitor Plan
11.9M
Local target includes 4.8 million international overnight visitors.
Bar lengths are normalized for visual scanning and should not be read as directly proportional across unlike metrics.
In high-growth windows, service-lag penalties rise quickly. Segment playbooks should prioritize response and turnaround metrics.
Improving transaction liquidity supports selective upgrades. Target capex on pain points with both high volume and high severity.
PwC's U.S. 2026 view points to a moderate cycle (occupancy around 62%, RevPAR +0.9%), so pricing and service quality must move together.
Data acquisition followed a destination-first crawl of Booking.com Da Nang listings. The collector iterated all destination result pages until no new property appeared, then iterated each selected hotel review page until no unseen review remained. Stored fields included positive and negative review sections, review date, score, hotel URL, and collection logs with timestamp plus canonical property URL.
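The stopping rule described above (keep paging until a page contributes nothing unseen) can be sketched as follows. This is a minimal illustration, not the study's collector: `collect_until_exhausted`, `fetch_page`, and the fake page contents are hypothetical names introduced here, and no real Booking.com endpoint or request logic is shown.

```python
from typing import Callable, Iterable, List, Set


def collect_until_exhausted(
    fetch_page: Callable[[int], Iterable[str]],
    max_pages: int = 10_000,
) -> List[str]:
    """Walk pages 1, 2, ... and stop at the first page that adds no unseen item.

    `fetch_page` is a hypothetical callable returning item identifiers
    (for example canonical property URLs) for a given page number.
    """
    seen: Set[str] = set()
    ordered: List[str] = []
    for page in range(1, max_pages + 1):
        added = 0
        for item in fetch_page(page):
            if item not in seen:
                seen.add(item)
                ordered.append(item)
                added += 1
        if added == 0:  # nothing new on this page: treat the listing as exhausted
            break
    return ordered


# Toy stand-in for paginated destination results (assumed data).
fake_pages = {1: ["hotel-a", "hotel-b"], 2: ["hotel-b", "hotel-c"], 3: ["hotel-c"]}
hotels = collect_until_exhausted(lambda p: fake_pages.get(p, []))
print(hotels)  # ['hotel-a', 'hotel-b', 'hotel-c']
```

The same loop shape would apply to per-hotel review pages, with review identifiers in place of property URLs.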
The preprocessing and sentence-construction workflow then parsed dates and scores, normalized hotel URLs, removed duplicates, filtered out low-quality reviews (those with fewer than five words), and split positive and negative review text into sentence units tagged by polarity source. Deduplication used a deterministic text-date-property key. Aspect assignment then used cosine similarity with a threshold of τ = 0.35.
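The cleaning and aspect-assignment steps above can be sketched as follows. Several details are assumptions made for illustration: the field names (`text`, `hotel_url`, `date`), the two-dimensional toy vectors standing in for sentence and aspect embeddings, and the choice to keep every aspect whose similarity is at least τ. The source states only the key components, the five-word minimum, and τ = 0.35.

```python
import math
from typing import Dict, List, Sequence, Tuple

TAU = 0.35       # cosine-similarity threshold reported in the text
MIN_WORDS = 5    # minimum review length reported in the text


def dedup_key(review: dict) -> Tuple[str, str, str]:
    """Deterministic text-date-property key (field names are assumed)."""
    return (review["text"], review["date"], review["hotel_url"])


def clean_reviews(reviews: List[dict]) -> List[dict]:
    """Drop reviews under MIN_WORDS words, then drop exact key duplicates."""
    seen = set()
    kept = []
    for review in reviews:
        if len(review["text"].split()) < MIN_WORDS:
            continue
        key = dedup_key(review)
        if key in seen:
            continue
        seen.add(key)
        kept.append(review)
    return kept


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def assign_aspects(sentence_vec: Sequence[float],
                   aspect_vecs: Dict[str, Sequence[float]],
                   tau: float = TAU) -> List[str]:
    """Return every aspect whose similarity to the sentence is at least tau."""
    return [name for name, vec in aspect_vecs.items()
            if cosine(sentence_vec, vec) >= tau]


reviews = [
    {"text": "Great room and very friendly staff", "hotel_url": "h1", "date": "2024-05-01"},
    {"text": "Great room and very friendly staff", "hotel_url": "h1", "date": "2024-05-01"},
    {"text": "Nice stay", "hotel_url": "h1", "date": "2024-05-02"},
]
cleaned = clean_reviews(reviews)

# Toy embeddings (assumed); real vectors would come from a sentence encoder.
aspect_vecs = {"room": [1.0, 0.0], "location": [0.0, 1.0]}
assigned = assign_aspects([0.9, 0.2], aspect_vecs)
```

In this toy case the duplicate and the two-word review are dropped, and the sentence vector clears the threshold only for `room`.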
Polarity adjustment was applied only when section and model direction conflicted. This preserved direction while reducing intensity in structurally inconsistent sentences. Review vectors were then built as salience, valence, and missingness features for each of eight aspects before clustering.
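A minimal sketch of the polarity adjustment and the per-review vector follows. It assumes a section sign of +1 for the positive section and -1 for the negative section, a model score y with a confidence weight c per sentence and aspect, salience as the share of a review's sentences that mention the aspect, and valence as the confidence-weighted mean of adjusted scores. The placeholder aspect names, the input layout, and the choice of 0.0 valence for unmentioned aspects are illustrative assumptions, not details from the source.

```python
from typing import Dict, List, Tuple

DAMPING = 0.70  # intensity factor applied when section and model direction conflict

# Placeholder names: the eight aspects are not enumerated here.
ASPECTS = [f"aspect_{j}" for j in range(1, 9)]


def adjust_polarity(section_sign: int, y: float) -> float:
    """Damp the model score when its sign disagrees with the section sign."""
    return DAMPING * y if section_sign * y < 0 else y


def review_vector(sentences: List[dict], aspects: List[str] = ASPECTS) -> List[float]:
    """Build [salience, valence, missing] for each aspect, concatenated.

    Each sentence is {"section_sign": +1 or -1, "scores": {aspect: (y, c)}}.
    """
    total = len(sentences)
    vec: List[float] = []
    for aspect in aspects:
        hits: List[Tuple[float, float]] = [
            (adjust_polarity(s["section_sign"], s["scores"][aspect][0]),
             s["scores"][aspect][1])
            for s in sentences if aspect in s["scores"]
        ]
        weight = sum(c for _, c in hits)
        salience = len(hits) / total if total else 0.0
        valence = sum(y * c for y, c in hits) / weight if weight > 0 else 0.0
        missing = 0.0 if hits else 1.0
        vec.extend([salience, valence, missing])
    return vec


sentences = [
    {"section_sign": +1, "scores": {"aspect_1": (0.8, 0.9)}},
    # Positive model score inside the negative section: damped to 0.35.
    {"section_sign": -1, "scores": {"aspect_1": (0.5, 0.6)}},
]
x = review_vector(sentences)
```

Here `x` has 24 entries (three per aspect); the first aspect has salience 1.0 and a weighted valence of about 0.62, and the remaining aspects are flagged as missing.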
Dedup key: translated review text + hotel URL + review date.
Polarity adjustment: if section polarity conflicts with model polarity ($\sigma_s \, y_{s,a} < 0$), then $y_{\mathrm{adj},s,a} = 0.70 \times y_{s,a}$; otherwise $y_{\mathrm{adj},s,a} = y_{s,a}$.
Review vector: $x_i = [\mathrm{salience}_{i,j}, \mathrm{valence}_{i,j}, \mathrm{missing}_{i,j}]_{j=1}^{8}$.
Segment labels were assigned after profiling dominant aspect attention and valence patterns together with review behavior indicators such as average score, narrative length, and negative-note prevalence. Names such as Minimalists, Room Inspectors, and Gastronomy Travelers therefore reflect observed evaluative frames, interpreted with market understanding and empathy for characteristic customer profiles rather than inferred personality traits.
The final analytical sample includes 74,896 reviews and 260,710 sentences after cleaning. Mean review score is 8.42. Aspect assignment produced 148,846 aspect-sentence pairs, then diagnostics were used to evaluate coverage, confidence, and robustness before interpretation.
Figure construction protocol
All result visuals were generated from versioned analytical outputs. Figure 1 uses weekly review-date aggregation with regional grouping logic. Segment-share composition uses final cluster totals. Embedding separation applies fixed UMAP settings (neighbors = 30, minimum distance = 0.1, random seed = 42) with sampled points and centroid overlays.
The valence heatmap is computed by merging experience vectors with cluster assignments, then averaging salience and valence by aspect and segment. Priority mapping is illustrative and is rendered after reported diagnostics are validated.
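The merge-then-average step behind the heatmap can be sketched as follows. The input layout, the inner-join behavior for reviews without a cluster label, and the use of mean segment salience as a stand-in for `salience_share` in the pain score are assumptions for illustration. The source specifies only that salience and valence are averaged by aspect and segment and that pain_score = salience_share × |negative valence|.

```python
from collections import defaultdict
from typing import Dict, Tuple


def segment_profiles(
    vectors: Dict[str, Dict[str, Tuple[float, float]]],
    clusters: Dict[str, int],
) -> Dict[Tuple[int, str], Tuple[float, float]]:
    """Average (salience, valence) per (segment, aspect).

    vectors:  {review_id: {aspect: (salience, valence)}}
    clusters: {review_id: segment_id}
    Reviews without a cluster label are skipped (inner join, assumed).
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for review_id, aspects in vectors.items():
        segment = clusters.get(review_id)
        if segment is None:
            continue
        for aspect, (salience, valence) in aspects.items():
            acc = sums[(segment, aspect)]
            acc[0] += salience
            acc[1] += valence
            acc[2] += 1
    return {key: (s / n, v / n) for key, (s, v, n) in sums.items()}


def pain_score(salience_share: float, valence: float) -> float:
    """salience_share times the magnitude of negative valence (0 if valence >= 0)."""
    return salience_share * max(0.0, -valence)


vectors = {
    "r1": {"room": (0.5, -0.4)},
    "r2": {"room": (0.3, -0.2)},
    "r3": {"room": (0.9, 0.8)},  # no cluster label, so it is skipped
}
clusters = {"r1": 0, "r2": 0}
profiles = segment_profiles(vectors, clusters)
room_salience, room_valence = profiles[(0, "room")]
room_pain = pain_score(room_salience, room_valence)
```

In this toy input, segment 0 averages to a salience of about 0.4 and a valence of about -0.3 for `room`, giving a pain score of about 0.12.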
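$\mathrm{Salience}_{i,j} = n_{i,j} / |S_i|$. $\mathrm{Valence}_{i,j} = \left(\sum_{s \in S_i} c_{s,j} \, y_{s,j}\right) / \left(\sum_{s \in S_i} c_{s,j}\right)$. Segment pain points in figure scripts use pain_score = salience_share × |negative valence|.
Figure set 1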
Weekly volume dynamics and observed segment share provide the empirical baseline for the study period and concentration structure.
Figure set 2
Embedding-space separation was used as a visual check that the cluster structure is not merely an artifact of random overlap.
Figure set 3
Aspect-level valence patterns identify where dissatisfaction concentrates by segment. The final map is illustrative and used only to communicate sequencing logic.
ABSA diagnostics indicated generally high sentiment confidence (mean 0.8636), with a low-confidence tail rate of 1.39 percent under the descriptive benchmark κ = 0.6247. K-means and GMM were compared for K = 3 to 10. K = 10 was retained because it balanced interpretability and internal validity while showing strong bootstrap stability (mean ARI = 0.94).
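For readers unfamiliar with the stability metric, the adjusted Rand index (ARI) compares two labelings of the same items and is invariant to how the clusters are numbered. The sketch below implements the standard pair-counting formula with the standard library and averages it over paired labelings. The study's bootstrap design is not described here, so `mean_ari` and the toy labelings are hypothetical illustrations rather than the reported procedure.

```python
from collections import Counter
from math import comb
from typing import List, Sequence, Tuple


def adjusted_rand_index(a: Sequence[int], b: Sequence[int]) -> float:
    """Adjusted Rand index between two labelings of the same items."""
    if len(a) != len(b):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(a), 2)
    # Pairs that share a cluster in both labelings, in a only, and in b only.
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / total_pairs if total_pairs else 0.0
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


def mean_ari(runs: List[Tuple[Sequence[int], Sequence[int]]]) -> float:
    """Average ARI over (reference_labels, resampled_labels) pairs."""
    return sum(adjusted_rand_index(ref, alt) for ref, alt in runs) / len(runs)


same_up_to_renaming = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Identical partitions with swapped label names give 1.0, while the crossed partition scores below zero (here -0.5), which is why a mean ARI near 1 is read as strong agreement across resamples.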
Cross-segment interpretation separates fundamentals-centered frames (for example room and cleanliness dominated) from experience-centered frames (for example ambiance, food and beverage, and service dominated). In the English-only sensitivity rerun, the overall ten-segment structure remained, with 80.6 percent label agreement and language sensitivity concentrated in the Explorers boundary.
43.7%
Persona: Efficient Essentials Guest
Short narratives with low verbosity and practical expectations. Focus stays on room basics and acceptable breakfast service without extended storytelling.
Managerial trigger: reduce check-in friction, secure room readiness, and keep service steps simple and consistent.
9.7%
Persona: Food-first Leisure Explorer
Evaluations are anchored in breakfast and dining operations. Service quality is judged through refill speed, queue flow, and perceived food hygiene cues.
Managerial trigger: protect breakfast flow, replenishment discipline, and dining-area cleanliness at peak periods.
8.3%
Persona: City and Coast Navigator
Location and mobility shape satisfaction. This group quickly reports access gaps between listing promises and real travel effort.
Managerial trigger: improve wayfinding support, access transparency, and transport guidance at arrival.
7.6%
Persona: Service Assurance Seeker
Staff interaction is the dominant evaluation channel. Satisfaction rises when requests are acknowledged fast and resolved with clear follow-through.
Managerial trigger: enforce recovery protocols for delays, missed requests, and handoff communication.
7.1%
Persona: Luxury Experience Curator
Ambiance and room-finish coherence drive value perception. Small defects can break high-end experience framing even when the score remains high.
Managerial trigger: maintain finish quality, noise control, and sensory consistency across touchpoints.
6.1%
Persona: Comfort Reliability Guest
This profile rewards stable basics: clean rooms, quiet nights, and predictable comfort. Novel features matter less than consistency.
Managerial trigger: tighten housekeeping consistency and correct minor maintenance issues rapidly.
5.9%
Persona: Structured Trip Planner
Longer reviews and detailed planning logic. Facility availability, operating hours, and process reliability define trust in the stay.
Managerial trigger: publish accurate operating windows and remove uncertainty around facility uptime.
4.3%
Persona: Quality Auditor
Highly sensitive to room and cleanliness failures. Negative cues in hygiene and upkeep dominate overall judgment and suppress substitution by other strengths.
Managerial trigger: treat hygiene as a non-negotiable threshold with daily audits and corrective logs.
3.9%
Persona: Value Maximizer
Compares delivered value against listing promises. Review sentiment deteriorates when fees, inclusions, or policy terms appear ambiguous.
Managerial trigger: standardize price-inclusion communication before booking and at check-in.
3.4%
Persona: Amenity-led Staycationer
Property facilities are the destination itself. Shared-space uptime, usage rules, and crowding conditions drive review outcomes.
Managerial trigger: coordinate preventive maintenance, occupancy smoothing, and pre-arrival facility status updates.
Persona framing follows common online hospitality archetype practice for business, leisure, luxury, value, and amenity-driven guests, then aligns each archetype to the observed segment signals in this dataset. Persona imagery in this panel is representative online reference photography for communication only.
Managerial interpretation is organized by operational control. Segment heterogeneity indicates that hotels should avoid one-size service programs and instead deploy differentiated actions tied to attention and valence signals in each segment.
Priority 1
For Room Inspectors and Homebodies, enforce strict room-readiness and hygiene standards. Baseline failures in these segments are weakly recoverable through other attributes.
Priority 2
For Hospitality-driven and Convenience Planners, reduce process variance across check-in, request handling, and facility operations. Consistency drives satisfaction more than isolated upgrades.
Priority 3
For Premium Travelers and Gastronomy Travelers, align package framing with ambiance, room finish, and dining quality. For Deal Hunters, communicate inclusions and fee boundaries with high precision.
Priority 4
Track segment share, negative-signal density, and aspect-level pain points at monthly cadence. Use threshold alerts to trigger corrective action at property and department level.
Recommended implementation sequence: (1) stabilize room and cleanliness operations, (2) tighten service process reliability, (3) deploy segment-specific value propositions, and (4) monitor monthly drift in segment mix and pain-point patterns.
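The monthly monitoring step could be prototyped along the following lines. This is a sketch under stated assumptions: the function name `share_drift_alerts`, the 2-percentage-point alert threshold, and the monthly counts are all hypothetical, since no alert thresholds are specified above.

```python
from typing import Dict


def share_drift_alerts(
    prev_counts: Dict[str, int],
    curr_counts: Dict[str, int],
    max_shift: float = 0.02,  # hypothetical threshold: 2 percentage points
) -> Dict[str, float]:
    """Return segments whose share of reviews moved by more than max_shift.

    Values are signed month-over-month changes in share (current minus previous).
    """
    prev_total = sum(prev_counts.values())
    curr_total = sum(curr_counts.values())
    alerts: Dict[str, float] = {}
    for segment in sorted(set(prev_counts) | set(curr_counts)):
        prev_share = prev_counts.get(segment, 0) / prev_total if prev_total else 0.0
        curr_share = curr_counts.get(segment, 0) / curr_total if curr_total else 0.0
        shift = curr_share - prev_share
        if abs(shift) > max_shift:
            alerts[segment] = shift
    return alerts


# Illustrative monthly review counts per segment (assumed numbers).
last_month = {"Room Inspectors": 50, "Deal Hunters": 50}
this_month = {"Room Inspectors": 60, "Deal Hunters": 40}
alerts = share_drift_alerts(last_month, this_month)
```

The same comparison could be applied to negative-signal density or aspect-level pain scores, with thresholds tuned per property or department.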