Booking review analytics · hotel operations

Experience-Based Hotel Segmentation from Aspect-Level Sentiment

This study analyzes Booking.com reviews of Da Nang hotels from 27 Nov 2022 to 27 Jan 2026. The pipeline maps review sentences to eight service aspects, scores sentiment at aspect level, builds review-level experience vectors, and validates cluster stability before interpretation. The final output is a ten-segment structure used to prioritize service and positioning decisions.

Review source

Booking.com guest reviews

Da Nang, Vietnam · Review window: 27 Nov 2022 - 27 Jan 2026 · 74,896 reviews · 458 hotels
Section 1

Background

This applied study examines heterogeneous guest experience frames in Da Nang hotels using review narratives rather than aggregated rating averages alone. The dataset contains 74,896 reviews across 458 hotels and captures the observed review stream from 27 Nov 2022 to 27 Jan 2026.

Total Reviews

74,896

Hotels Covered

458 hotels

Retained Segments

10 segments

Average Review Score

8.42 / 10

The research gap is practical and methodological: aggregate review indicators are easy to monitor but weak for diagnosis because they compress multi-attribute experiences into single scores. The study objective is to convert narrative reviews into aspect-level experience vectors, derive stable market segments, and interpret those segments as operationally useful experience frames.

Applied Market Context · Updated February 2026

Global International Arrivals (2024)

1.4B

UN Tourism reports the market returned to roughly 99% of pre-pandemic (2019) levels.

International Arrivals (Q1 2025)

300M

Momentum remained positive at +5% year-over-year in early 2025.

Global Room-Night Demand (2024)

4.8B

JLL estimates demand rose by 102 million room nights from 2023.

Hotel Investment Volume (2024)

$57.3B

Global hotel transactions increased around 7% and liquidity improved.

Da Nang 2025 Visitor Plan

11.9M

Local target includes 4.8 million international overnight visitors.

Market Signals (2024-2026)

Bar lengths are normalized for visual scanning and should not be read as directly proportional across unlike metrics.

Operational Relevance for This Study

Speed wins conversion

In high-growth windows, service-lag penalties rise quickly. Segment playbooks should prioritize response and turnaround metrics.

Capital is returning

Improving transaction liquidity supports selective upgrades. Target capex on pain points with both high volume and high severity.

Margin discipline still matters

PwC's U.S. 2026 view points to a moderate cycle (occupancy around 62%, RevPAR +0.9%), so pricing and service quality must move together.

Section 2

Method

Data acquisition followed a destination-first crawl of Booking.com Da Nang listings. The collector iterated all destination result pages until no new property appeared, then iterated each selected hotel review page until no unseen review remained. Stored fields included positive and negative review sections, review date, score, hotel URL, and collection logs with timestamp plus canonical property URL.

The preprocessing and sentence-construction workflow then parsed dates and scores, normalized hotel URLs, removed duplicates, filtered low-quality reviews (minimum five words), and split positive and negative review text into sentence units tagged by polarity source. Deduplication used a deterministic text-date-property key, then aspect assignment used cosine similarity with threshold τ = 0.35.
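The deduplication, length filter, and thresholded aspect assignment described above can be sketched as follows. This is a minimal illustration, not the study's code: the field names (`text`, `hotel_url`, `date`), the case and whitespace folding inside the key, and the rule that a sentence keeps every aspect at or above τ (rather than only the best match) are assumptions. The source also does not say which embeddings feed the cosine similarity, so the vectors here are placeholders.

```python
import math
from typing import Dict, List, Sequence, Tuple

TAU = 0.35       # cosine-similarity threshold stated in the text
MIN_WORDS = 5    # minimum review length stated in the text


def dedupe_and_filter(reviews: List[dict]) -> List[dict]:
    """Keep the first review per (text, hotel URL, date) key with at least MIN_WORDS words."""
    seen = set()
    kept = []
    for review in reviews:
        # Hypothetical field names; folding case and whitespace is an assumption.
        key = (" ".join(review["text"].lower().split()), review["hotel_url"], review["date"])
        if key in seen or len(review["text"].split()) < MIN_WORDS:
            continue
        seen.add(key)
        kept.append(review)
    return kept


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def assign_aspects(
    sentence_vec: Sequence[float],
    aspect_vecs: Dict[str, Sequence[float]],
    tau: float = TAU,
) -> List[Tuple[str, float]]:
    """Return (aspect, similarity) for every aspect whose similarity is >= tau."""
    scored = [(name, cosine(sentence_vec, vec)) for name, vec in aspect_vecs.items()]
    return [(name, score) for name, score in scored if score >= tau]
```

Under this multi-match assumption a sentence can yield zero, one, or several aspect-sentence pairs, so the pair count need not equal the sentence count.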

Polarity adjustment was applied only when section and model direction conflicted. This preserved direction while reducing intensity in structurally inconsistent sentences. Review vectors were then built as salience, valence, and missingness features for each of eight aspects before clustering.

Dedup key: translated review text + hotel URL + review date. If section polarity conflicts with model polarity ($\sigma_s \, y_{s,a} < 0$), then $y_{\mathrm{adj},s,a} = 0.70 \times y_{s,a}$; otherwise $y_{\mathrm{adj},s,a} = y_{s,a}$. Review vector: $x_i = [\mathrm{salience}_j, \mathrm{valence}_j, \mathrm{missing}_j]_{j=1,\dots,8}$.
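A minimal sketch of the polarity adjustment and the 24-feature review vector follows. It assumes the section polarity is encoded as a sign (+1 for the positive section, -1 for the negative section), that each aspect hit carries a model polarity y and a weight c (read here as a confidence weight), and that valence is computed on the adjusted polarities. The aspect names are placeholders because the source does not list the eight aspects here, and the zero fill for absent aspects is also an assumption.

```python
from typing import Dict, List, Sequence, Tuple

DAMPING = 0.70  # intensity factor stated in the text
ASPECTS = tuple(f"aspect_{j}" for j in range(1, 9))  # placeholder names for the eight aspects

# One sentence: (section_sign, {aspect: (y, c)}), where y is model polarity and c its weight.
Sentence = Tuple[int, Dict[str, Tuple[float, float]]]


def adjust_polarity(section_sign: int, y: float) -> float:
    """Damp y by 0.70 when the section sign and model polarity disagree; direction is kept."""
    return DAMPING * y if section_sign * y < 0 else y


def review_vector(sentences: List[Sentence], aspects: Sequence[str] = ASPECTS) -> List[float]:
    """Build [salience_j, valence_j, missing_j] for each aspect j, concatenated in order."""
    n_sentences = len(sentences)
    vec: List[float] = []
    for aspect in aspects:
        hits = [
            (adjust_polarity(sign, pairs[aspect][0]), pairs[aspect][1])
            for sign, pairs in sentences
            if aspect in pairs
        ]
        if not hits:
            # Assumption: absent aspects get zero salience and valence plus a missing flag.
            vec += [0.0, 0.0, 1.0]
            continue
        weight = sum(c for _, c in hits)
        salience = len(hits) / n_sentences                     # n_ij / |S_i|
        valence = sum(c * y for y, c in hits) / weight if weight > 0 else 0.0
        vec += [salience, valence, 0.0]
    return vec
```

Keeping an explicit missing flag lets the clustering step distinguish "aspect not mentioned" from "aspect mentioned with neutral valence", which a zero valence alone would conflate.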
Flow chart of the hotel segmentation methodology from data collection to segment profiling
Static process map of the ABSA-to-clustering pipeline used to derive and validate the segment solution.

Segment labels were assigned after profiling dominant aspect attention and valence patterns together with review behavior indicators such as average score, narrative length, and negative-note prevalence. Names such as Minimalists, Room Inspectors, and Gastronomy Travelers therefore describe observed evaluative frames rather than inferred personality traits. They were interpreted with market understanding and empathy for characteristic customer profiles.

Section 3

Results

The final analytical sample includes 74,896 reviews and 260,710 sentences after cleaning. The mean review score is 8.42 out of 10. Aspect assignment produced 148,846 aspect-sentence pairs, and diagnostics were then used to evaluate coverage, confidence, and robustness before interpretation.

Figure construction protocol

All result visuals were generated from versioned analytical outputs. Figure 1 uses weekly review-date aggregation with regional grouping logic. Segment-share composition uses final cluster totals. Embedding separation applies fixed UMAP settings (neighbors = 30, minimum distance = 0.1, random seed = 42) with sampled points and centroid overlays.

  1. Prepare analysis tables from cleaned reviews, sentence corpus, and cluster assignments.
  2. Compute segment-level aggregates for salience, valence, and temporal share by week and month.
  3. Generate diagnostic projections and centroids from the feature matrix to evaluate separation quality.
  4. Render publication assets as static PNG and SVG outputs, then map them to responsive portfolio figure blocks.

The valence heatmap is computed by merging experience vectors with cluster assignments, then averaging salience and valence by aspect and segment. Priority mapping is illustrative and is rendered after reported diagnostics are validated.
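The merge-then-average step behind the heatmap might look like the sketch below. The input shapes are hypothetical: `vectors` maps a review id to per-aspect (salience, valence) pairs and `assignments` maps a review id to its segment. Two choices are assumptions, not documented behavior: salience is averaged over all reviews in a segment (absent aspects count as zero), while valence is averaged only over reviews that mention the aspect.

```python
from collections import defaultdict
from typing import Dict, Hashable, Optional, Sequence, Tuple

ReviewFeatures = Dict[str, Tuple[float, float]]  # aspect -> (salience, valence)


def segment_aspect_means(
    vectors: Dict[Hashable, ReviewFeatures],
    assignments: Dict[Hashable, str],
    aspects: Sequence[str],
) -> Dict[Tuple[str, str], Tuple[float, Optional[float]]]:
    """Return {(segment, aspect): (mean salience, mean valence or None)}."""
    salience_sum = defaultdict(float)
    valence_sum = defaultdict(float)
    valence_n = defaultdict(int)
    reviews_n = defaultdict(int)

    for review_id, segment in assignments.items():
        features = vectors.get(review_id)
        if features is None:
            continue  # inner join: skip reviews without an experience vector
        reviews_n[segment] += 1
        for aspect in aspects:
            if aspect in features:
                salience, valence = features[aspect]
                salience_sum[(segment, aspect)] += salience
                valence_sum[(segment, aspect)] += valence
                valence_n[(segment, aspect)] += 1

    means = {}
    for segment, n in reviews_n.items():
        for aspect in aspects:
            key = (segment, aspect)
            mean_valence = valence_sum[key] / valence_n[key] if valence_n[key] else None
            means[key] = (salience_sum[key] / n, mean_valence)
    return means
```

Returning `None` for an unmentioned aspect keeps an empty heatmap cell distinct from a genuinely neutral one.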

$\mathrm{Salience}_{i,j} = n_{i,j} / |S_i|$. $\mathrm{Valence}_{i,j} = \left(\sum_{s \in S_i} c_{s,j} \, y_{s,j}\right) / \left(\sum_{s \in S_i} c_{s,j}\right)$. Segment pain points in figure scripts use pain_score = salience_share × |negative valence|.
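The pain-score ranking can be sketched as below. The text does not define "negative valence" precisely, so this sketch assumes it means the negative part of the segment's mean valence, min(valence, 0), which gives aspects with non-negative valence a score of zero. The function name and input shape are hypothetical.

```python
from typing import Dict, List, Tuple


def pain_scores(segment_means: Dict[str, Tuple[float, float]]) -> List[Tuple[str, float]]:
    """Rank aspects for one segment by salience_share * |negative valence|.

    segment_means maps aspect -> (salience_share, mean_valence).
    """
    scores = {}
    for aspect, (salience_share, valence) in segment_means.items():
        negative_part = min(valence, 0.0)  # assumption: positive valence contributes no pain
        scores[aspect] = salience_share * abs(negative_part)
    # Highest pain first; ties keep their input order because sorted() is stable.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = pain_scores({"room": (0.4, -0.5), "food": (0.3, 0.6), "staff": (0.2, -0.2)})
```

Multiplying by salience share means a mildly negative aspect that dominates a segment's attention can outrank a strongly negative aspect that is rarely mentioned.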

Figure set 1

Weekly volume dynamics and observed segment share provide the empirical baseline for the study period and concentration structure.

Weekly review volume trend for the Da Nang hotel market
Figure 1. Weekly review volume by reviewer economy group over the observed period.
Treemap of observed traveler segment shares
Observed segment-share composition showing concentration in the Minimalists segment (43.7%).

Figure set 2

Embedding-space separation served as a visual check that the clusters occupy distinguishable regions of the feature space rather than overlapping at random.

Map showing whether hotel segments overlap or stay separate
Embedding projection used as a cluster-separation diagnostic alongside internal validity indices.

Figure set 3

Aspect-level valence patterns identify where dissatisfaction concentrates by segment. The final map is illustrative and used only to communicate sequencing logic.

Heatmap of positive and negative guest feedback by segment
Aspect valence heatmap showing concentration of positive and negative signals by segment.
Value-risk priority map used for sequencing discussion
Illustrative value-risk priority map used for communication after reported diagnostics are reviewed.

ABSA diagnostics indicated generally high sentiment confidence (mean 0.8636), with a low-confidence tail rate of 1.39 percent under descriptive benchmark κ = 0.6247. K-means and GMM were compared over K = 3...10, and K = 10 was retained because it balanced interpretability and internal validity while showing strong bootstrap stability (mean ARI = 0.94).
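For readers unfamiliar with the stability metric, the adjusted Rand index (ARI) compares two labelings of the same items and is invariant to how cluster ids are numbered. The sketch below implements the standard ARI formula in plain Python and averages it against a reference labeling. How the study drew its bootstrap resamples and mapped them back onto common reviews is not stated, so `mean_ari_vs_reference` assumes each bootstrap labeling already covers the same reviews in the same order as the reference.

```python
from collections import Counter
from math import comb
from typing import Hashable, List, Sequence


def adjusted_rand_index(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Standard ARI between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    cells = Counter(zip(labels_a, labels_b))   # contingency table cells
    rows = Counter(labels_a)                   # cluster sizes in labeling A
    cols = Counter(labels_b)                   # cluster sizes in labeling B

    index = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    total_pairs = comb(n, 2)

    expected = sum_rows * sum_cols / total_pairs if total_pairs else 0.0
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings put everything in one cluster
    return (index - expected) / (max_index - expected)


def mean_ari_vs_reference(
    reference: Sequence[Hashable], bootstrap_labelings: List[Sequence[Hashable]]
) -> float:
    """Average ARI of each bootstrap labeling against the reference labeling."""
    scores = [adjusted_rand_index(reference, labels) for labels in bootstrap_labelings]
    return sum(scores) / len(scores)
```

An ARI of 1 means identical partitions and values near 0 mean chance-level agreement, so a mean of 0.94 indicates that resampled solutions largely reproduce the same grouping.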

Cross-segment interpretation separates fundamentals-centered frames (for example room and cleanliness dominated) from experience-centered frames (for example ambiance, food and beverage, and service dominated). In the English-only sensitivity rerun, the overall ten-segment structure remained, with 80.6 percent label agreement and language sensitivity concentrated in the Explorers boundary.

Efficient essentials guest persona portrait in business attire

43.7%

Minimalists

Persona: Efficient Essentials Guest

Short narratives with low verbosity and practical expectations. Focus stays on room basics and acceptable breakfast service without extended storytelling.

Managerial trigger: reduce check-in friction, secure room readiness, and keep service steps simple and consistent.

Breakfast buffet service scene representing food-first leisure explorer persona

9.7%

Gastronomy Travelers

Persona: Food-first Leisure Explorer

Evaluations are anchored in breakfast and dining operations. Service quality is judged through refill speed, queue flow, and perceived food hygiene cues.

Managerial trigger: protect breakfast flow, replenishment discipline, and dining-area cleanliness at peak periods.

Travelers reading a city map to represent the explorer persona

8.3%

Explorers

Persona: City and Coast Navigator

Location and mobility shape satisfaction. This group quickly reports access gaps between listing promises and real travel effort.

Managerial trigger: improve wayfinding support, access transparency, and transport guidance at arrival.

Hotel receptionist portrait representing service assurance persona

7.6%

Hospitality-driven

Persona: Service Assurance Seeker

Staff interaction is the dominant evaluation channel. Satisfaction rises when requests are acknowledged fast and resolved with clear follow-through.

Managerial trigger: enforce recovery protocols for delays, missed requests, and handoff communication.

Elegant traveler portrait in a luxury hotel setting

7.1%

Premium Travelers

Persona: Luxury Experience Curator

Ambiance and room-finish coherence drive value perception. Small defects can break the high-end experience framing even when the score remains high.

Managerial trigger: maintain finish quality, noise control, and sensory consistency across touchpoints.

Guest relaxing under a blanket to represent comfort reliability persona

6.1%

Homebodies

Persona: Comfort Reliability Guest

This profile rewards stable basics: clean rooms, quiet nights, and predictable comfort. Novel features matter less than consistency.

Managerial trigger: tighten housekeeping consistency and correct minor maintenance issues rapidly.

Trip planning desk with map and laptop representing structured planner persona

5.9%

Convenience Planners

Persona: Structured Trip Planner

Longer reviews and detailed planning logic. Facility availability, operating hours, and process reliability define trust in the stay.

Managerial trigger: publish accurate operating windows and remove uncertainty around facility uptime.

Professional inspector with clipboard representing quality auditor persona

4.3%

Room Inspectors

Persona: Quality Auditor

Highly sensitive to room and cleanliness failures. Negative cues in hygiene and upkeep dominate overall judgment and suppress substitution by other strengths.

Managerial trigger: treat hygiene as a non-negotiable threshold with daily audits and corrective logs.

Budget-minded traveler scene representing value maximizer persona

3.9%

Deal Hunters

Persona: Value Maximizer

Compares delivered value against listing promises. Review sentiment deteriorates when fees, inclusions, or policy terms appear ambiguous.

Managerial trigger: standardize price-inclusion communication before booking and at check-in.

Guests enjoying an indoor resort pool to represent staycation persona

3.4%

Staycation Guests

Persona: Amenity-led Staycationer

Property facilities are the destination itself. Shared-space uptime, usage rules, and crowding conditions drive review outcomes.

Managerial trigger: coordinate preventive maintenance, occupancy smoothing, and pre-arrival facility status updates.


Persona framing follows common online hospitality archetype practice for business, leisure, luxury, value, and amenity-driven guests, then aligns each archetype to the observed segment signals in this dataset. Persona imagery in this panel is representative online reference photography for communication only.

Section 4

Implications

Managerial interpretation is organized by operational control. Segment heterogeneity indicates that hotels should avoid one-size-fits-all service programs and instead deploy differentiated actions tied to the attention and valence signals in each segment.

Priority 1

Protect fundamentals before differentiation

For Room Inspectors and Homebodies, enforce strict room-readiness and hygiene standards. Baseline failures in these segments are weakly recoverable through other attributes.

Priority 2

Match service process to segment logic

For Hospitality-driven and Convenience Planners, reduce process variance across check-in, request handling, and facility operations. Consistency drives satisfaction more than isolated upgrades.

Priority 3

Convert experience segments into revenue design

For Premium Travelers and Gastronomy Travelers, align package framing with ambiance, room finish, and dining quality. For Deal Hunters, communicate inclusions and fee boundaries with high precision.

Priority 4

Institutionalize monthly review analytics

Track segment share, negative-signal density, and aspect-level pain points at a monthly cadence. Use threshold alerts to trigger corrective action at the property and department level.
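One way to operationalize the monthly tracking is sketched below: compute each segment's share of reviews per month, then flag month-over-month changes that exceed a threshold. The record shape (`"YYYY-MM"`, segment) and the 5-percentage-point default threshold are illustrative assumptions and would need tuning per property; the study does not specify alert thresholds.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def monthly_segment_share(records: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """records: (month as 'YYYY-MM', segment) pairs -> {month: {segment: share}}."""
    counts = defaultdict(Counter)
    for month, segment in records:
        counts[month][segment] += 1
    return {
        month: {seg: n / sum(counter.values()) for seg, n in counter.items()}
        for month, counter in counts.items()
    }


def share_drift_alerts(
    shares: Dict[str, Dict[str, float]], threshold: float = 0.05
) -> List[Tuple[str, str, float]]:
    """Return (month, segment, change) where the share moved by >= threshold vs the prior month."""
    months = sorted(shares)  # 'YYYY-MM' strings sort chronologically
    alerts = []
    for prev, cur in zip(months, months[1:]):
        for segment in sorted(set(shares[prev]) | set(shares[cur])):
            delta = shares[cur].get(segment, 0.0) - shares[prev].get(segment, 0.0)
            if abs(delta) >= threshold:
                alerts.append((cur, segment, round(delta, 4)))
    return alerts
```

The same pattern extends to negative-signal density or pain scores by swapping the share table for the corresponding monthly metric.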

Recommended implementation sequence: (1) stabilize room and cleanliness operations, (2) tighten service process reliability, (3) deploy segment-specific value propositions, and (4) monitor monthly drift in segment mix and pain-point patterns.