Travellers’ perspectives on historic squares and railway stations in Italian heritage cities revealed through sentiment analysis

ABSTRACT This study undertakes sentiment analysis of online reviews of public exterior spaces – historic squares and railway stations – in popular destinations in Italy, with the aim of offering new perspectives of community engagement in urban design analysis. The experience of walking through urban spaces in Italian heritage cities is evaluated under indicators of place quality and connectivity, i.e., aesthetic perception, social interaction, body mobility, facilities and amenities, sense of safety, and destination loyalty. Such advanced analysis can reshape the way we interpret the thoughts and emotions of wider communities so that these are included in local place-focused development strategies.


Introduction
Entire generations of travellers, urban planners and designers, architects, heritage practitioners, and researchers in historic landscape studies have been fascinated by a wide array of aesthetic forms and uses of exterior public space in Italian heritage cities (Bacon 1992; Canniffe 2008; Guidoni 2006, among others). The perceptual form of public space encompasses the interdependence of physical features, historical and cultural legacies, and socio-economic evidence of different civilizations over time.
Notwithstanding the importance and the complexity inherent in the preservation of the continuity, equity, and authenticity of the historic urban landscape (HUL) (Cohen 2002; Erkan 2018; Khalaf 2021), any heritage management strategy should consider the dynamic nature of the city within a holistic future-oriented approach. Environmental sustainability and more-participatory management strategies for heritage assets have been raised as issues, while limits on conceptualizing community engagement still draw controversies (Waterton and Smith 2010).
In keeping with this traveller-centric perspective, this study focuses on the quality of the experience in historic squares and railway stations in relevant Italian heritage cities. Historic squares (HSs) embody the multilayered expression of citizenship, while railway stations (RSs) often serve as main urban entrances and irreplaceable landmarks. In experiencing these places in a porous city, the traveller approaches memories shaped and overlapping across centuries (Wolfrum 2018). HSs and RSs are central nodes of the social praxis and founding elements of the physical cityscape, with distinctive urban landmarks and monumental architecture (Mazzoni 2001; Guidoni 2006; D'Agostino 2013). Their aesthetic and use values, thanks to which they were preserved, transcend time and culture. For convenience and because it would go beyond the scope of this study, the main characteristics of the selected HSs and RSs (Figure 1) are not analysed here.
While this study does not claim to represent the whole perceptual and kinetic complexity of on-site experiences, it proposes a semiautomatic method for cataloguing travellers' emotional feedback over time in selected walking environments.
This study undertakes sentiment analysis of text-based online travel reviews. Sentiment analysis is a natural language processing (NLP) technique for extracting subjective information from a massive amount of unstructured user-generated contents (UGCs) (Chen and Xie 2020). It contributes to a better understanding of the quality of the walking experience as perceived by a large and culturally diverse group of English-speaking travellers, regardless of their socio-demographic profiles (e.g., socio-cultural background, area of residence, income, age, or gender). Such remote analysis, which adopts and trains a context-based algorithm, makes it possible to map the emotional feedback of a large cluster sample, overcoming some limitations of face-to-face surveys or other traditional data collection methods (Münster et al. 2017).
UGCs influence consumer behaviour in the pre-travel stage and other travel-related choices, as shown in the literature (Alcocer and Ruiz 2020; Joseph, Peter, and Anandkumar 2020). However, the credibility of UGCs in travel platforms and blogs has been questioned (Ayeh, Au, and Law 2013; Hassan and Elkhateeb 2021). The potential of digital social platforms or internet-enabled surveys to decode the perceived quality of public space has been little studied so far (Münster et al. 2017). Sentiment analysis using online reviews has been mostly used in the hospitality and tourism domains to assess experiences in hotels and restaurants and in other travelling practices (Ghahramani et al. 2021).
This research is relevant in making a cultural valuation of the analysed case studies and of their unique situation in the urban flow. This novel methodological approach in urban design offers a complementary view to what may result from urban resident survey analysis and tourism statistics.

A brief look at the factors influencing traveller experience
The space where people walk makes an immediate impact on their emotional sphere (Cullen 1995; Hassan and Elkhateeb 2021). Visual and kinetic perceptions are influenced by the interplay between multiple aesthetic and use values, plus personal background, culture, and past experiences (Carmona 2014). The coexistence of diverse factors influencing the perception of the walking environment has been thoroughly discussed (e.g., Lynch 1960, 1981; Jacobs and Appleyard 1987; Rapoport and Hawkes 1970; Guidoni 2006; Canniffe 2008; Mehta 2014).
In addressing space perception analysis, Benjamin's 'porosity' and 'isotropy' are fundamental concepts for approaching the urban complexity of diverse relational spaces (Wolfrum 2018). '[B]uilding and action interpenetrate in the courtyards, arcades, and stairways. In everything they preserve the scope to become a theatre of new, unforeseen constellations' (Benjamin 1935, 165). The contemporary city and its fragments, including HSs and RSs, are transient and connective spaces, permeable and marked by seamless flux and events, as well as by conflicting values and overlapping uses. The quality of the space depends on provisional (or seasonal) environmental conditions and personal (or affective) dimensions. People's perceptions result from the information detected through their senses, memories, and expectations (Rodaway 1994). Ewing and Handy (2009) identify a set of operational measurements of the street environment and people's behaviour (e.g., 'imageability', 'enclosure', 'human scale', 'transparency', 'complexity', 'legibility', 'linkage', 'coherence'). Carmona (2019) identifies 12 measurable elements of local environmental quality. Travel attitudes and travel motivation depend on psychological and sociodemographic factors (Jönsson and Devonish 2008). 'Destination loyalty' is another relevant indicator showing the intention of the tourist to return and to recommend a similar experience in the same place, thus depending on the traveller's overall satisfaction (Cossío-Silva, Revilla-Camacho, and Vega-Vázquez 2019).

Background
Social media has produced a huge amount of UGCs in various formats, including quantitative features (e.g., number of likes, ratings, shares), textual contents (e.g., posts, customer reviews), and images/videos. Consumers have been empowered to share their personal opinions about products and services on online review platforms (Hu and Krishen 2019). Visually-appealing platforms (e.g., TripAdvisor) attract travellers to share their experiences through gamification badges and other features. Scholars and practitioners in tourism and hospitality are devoting effort to extracting meaningful semantic evaluations from these contents, which offer the possibility of understanding polyvocal perspectives by collecting and making use of their online textual reviews.
Due to their high volume, variety, and questions of veracity, these contents cannot be analysed to their full extent manually. The automated analysis of text also faces challenges in view of its unstructured nature and the subtleties of natural language that are too complex to be dealt with by an algorithm, e.g., implicit context, metaphors, and irony (Potamias, Siolas, and Stafylopatis 2020). Text mining encompasses a set of tools and techniques designed for dealing with such challenges (Jo 2019). Among those techniques, sentiment analysis consists in processing text to compute sentiment polarity or a sentiment score (Feldman 2013; Chen and Xie 2020). A sentence can be deemed positive, negative, or neutral (zero score), depending on which words appear, their position within the sentence, and punctuation. Beyond producing a binary polarity, a sentiment score provides additional information for detecting emotions in the text, also considering exclamation marks or question marks.
Surprisingly, only a handful of studies have assessed the quality of spaces through text mining, e.g., a qualitative evaluation of popular urban parks in Dublin using TripAdvisor and Foursquare reviewers (Ghahramani et al. 2021) and a qualitative evaluation of a densely populated area in Beijing using data from a Twitter-like social media platform (Gao et al. 2022). Sentiment analysis has also been conducted using raw Twitter data on 60 urban green spaces in Birmingham (Roberts, Sadler, and Chapman 2019) and has been used to understand urban and spatial practices of visitors and residents during a large sporting event, the 2012 London Olympic Games (Kovacs-Gyori et al. 2018).

Methodological approach
Selected in this study are the most-visited travel destinations among the Historic Squares (HSs) and Railway Stations (RSs) in Italian heritage cities, based on data obtained from the Italian observatory of tourism flows. The filtered reviews regard opinions posted on TripAdvisor from 2013 to 2020 (Table 1).
Based on the literature and adapted to this research context, six evaluation categories are chosen to encompass cross-scale dimensions of travellers' experience. Diverse factors and sensory stimuli are associated with each evaluation category, which are then divided into 34 subcategories (topics and adjectives) shown in Table 2.
Patterns of interdependence and complementarity can be outlined amongst these topics, even if grouped into different categories. To name a few, site condition and seasonality from (i) aesthetic perception are related to sense of safety and convenience from (v) tourism risk perception (or discomfort); (ii) social interaction and vibrancy is associated with (iv) facilities and amenities. Another relevant interrelation is that among the popularity of squares, the place identity and attachment, diversity and simultaneity of activity (ii) with furniture, i.e., the availability of equipment for sitting and other visitor facilities (iv), as stated in the literature (e.g., Whyte 1980).
A dictionary-term matrix is next elaborated to cover the main dimensions of the space perception analysis and to match the semantic data retrieved from TripAdvisor reviews. This dictionary includes frequently appearing features, i.e., one-or-double terms (equivalents), which are manually found in the raw texts and grouped to make a reliable criterion-referenced assessment possible.
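As an illustration of how a dictionary of single and double terms can be matched against review text, the following Python sketch may help. It is not the authors' implementation: the category names are shortened, the terms are invented examples rather than entries from the study's dictionary, and `matched_categories` is a hypothetical helper.

```python
import re

# Hypothetical excerpt of a dictionary-term matrix: category -> equivalents.
# The terms are illustrative only, not the study's actual dictionary.
DICTIONARY = {
    "aesthetic perception": {"cathedral", "facade", "photo opportunities"},
    "facilities and amenities": {"bench", "gelato", "ticket office"},
}


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def matched_categories(sentence, dictionary=DICTIONARY):
    """Return the categories whose single or double terms occur in the sentence."""
    tokens = tokenize(sentence)
    unigrams = set(tokens)
    # Adjacent word pairs, so that double terms such as "ticket office" match.
    bigrams = {" ".join(pair) for pair in zip(tokens, tokens[1:])}
    candidates = unigrams | bigrams
    return {cat for cat, terms in dictionary.items() if terms & candidates}
```

Matching on exact tokens means that misspelled or inflected forms are missed, which is consistent with the limitation on misspelled terms acknowledged later in the paper.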
An algorithm is developed to compute the sentiment score for each <review, category> pair. If a review contains any sentence matching a word within a category, then the sentiment score of that sentence is counted for that review. If two or more sentences are found for a category in the same review, then the average sentiment score is computed. The algorithm is implemented in the R scripting language. The VADER (Valence Aware Dictionary and sEntiment Reasoner) sentiment analysis is adopted for computing the sentiment score.
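The per-review averaging rule described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the study's R code: `VALENCE` is a toy word list standing in for VADER, `CATEGORY_TERMS` is an invented two-category dictionary, and all function names are hypothetical.

```python
import re
from statistics import mean

# Toy stand-ins (assumptions): a tiny valence lexicon instead of VADER,
# and a two-category dictionary instead of the study's full matrix.
VALENCE = {"beautiful": 0.8, "lovely": 0.6, "dirty": -0.6, "crowded": -0.4}
CATEGORY_TERMS = {
    "aesthetic": {"square", "cathedral"},
    "facilities": {"toilets", "benches"},
}


def split_sentences(review):
    """Split a review into sentences at '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", review.strip()) if s]


def sentence_score(sentence):
    """Placeholder scorer: mean valence of known words, 0.0 if none is found."""
    words = re.findall(r"[a-z]+", sentence.lower())
    hits = [VALENCE[w] for w in words if w in VALENCE]
    return mean(hits) if hits else 0.0


def review_category_scores(review):
    """Average the scores of all sentences that mention each category."""
    per_category = {}
    for sentence in split_sentences(review):
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        for category, terms in CATEGORY_TERMS.items():
            if words & terms:
                per_category.setdefault(category, []).append(sentence_score(sentence))
    return {c: mean(scores) for c, scores in per_category.items()}
```

A category that no sentence mentions is simply absent from the result, so a review contributes a score only to the categories it actually discusses.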
Following this preliminary analysis, the sentiment scores are discretized into five categories (scores: −1; −0.5; 0; +0.5; +1), considering a random sample of reviews about three top-visited destinations for expert validation. Two tourism-related lexica are then built, and a score is associated with each word for the second refinement of outputs (Supplemental material). Values of +1 or −1 are assigned to those indicators that had direct positive or negative relevance to a cultural and inclusive experience, e.g., reference to tangible and intangible values, cost-effectiveness, pedestrian comfort. Seasonal indicators (e.g., site condition) and temporary events are scored with medium values (+0.5 or −0.5). These lexica consist of the 600 most common single terms found in the reviews, and the 300 most common double-triple terms.
The score-based approach refined by this rich, context-based dictionary makes it possible to accurately map travellers' perceptions. In fact, remote users often used specific expressions or local terms, e.g., (iv) sub. 20, that only a customized algorithm can adequately score. To name a few, many travellers just mentioned artefacts (e.g., 'what a plaza!') and related values/attributes (e.g., 'photo opportunities', 'starting place'), or attributive adjectives for travel-related activities (e.g., 'fun people watching').
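The text does not state the rule used to map continuous scores onto the five levels. One plausible reading, offered here only as an assumption, is to snap each score to the nearest level; `LEVELS` and `discretize` are hypothetical names.

```python
# The five discrete levels named in the text.
LEVELS = (-1.0, -0.5, 0.0, 0.5, 1.0)


def discretize(score):
    """Snap a continuous score to the nearest of the five levels.

    Scores outside [-1, 1] are clipped first. Exact ties (e.g. 0.25) go to
    the lower level, because min() keeps the first minimum it encounters.
    """
    clipped = max(-1.0, min(1.0, score))
    return min(LEVELS, key=lambda level: abs(level - clipped))
```

Other binning rules (for example, fixed thresholds agreed during expert validation) would be equally compatible with the description.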

Dictionary-term matrix and data visualization
Two aspects of the dictionary-term matrix are relevant. The large use of Italian words in online TripAdvisor reviews by English-speakers, especially regarding (iv) facilities and amenities (sub. 20 and sub. 22, Table 2) shows the impact of the travel experience on their emotional and cognitive sphere. On the other hand, the number of equivalents is highest for (v) sense of safety and convenience, sub. 27, tourism risk perception or discomfort (156 equivalents) and for (ii) social interaction and vibrancy, sub. 13, visitor activities and related items (131 equivalents). The high number of equivalents about (v) sense of safety and convenience, sub. 27, depends on the richness of anxiety-evoking stimuli that result from individual negative perspectives and more-objectively-adverse local conditions. The high number of terms about (ii) social interaction and vibrancy, sub. 13, depends on the variety of activities in HSs and RSs and the structure of the text. The high number of equivalents about (iv) facilities and amenities, sub. 20, mobility patterns and visitor facilities (142 equivalents) results from the diversity of the public services offered in HSs and RSs and rich narratives on public transportation.
Approximately 43,000 online reviews about 10 public spaces in Italian heritage cities are analysed against 6 evaluation categories (Table 2, 1st column). For each of these categories, single topics and related adjectives (Table 2, 4th column) are grouped to integrate and ease data visualization. These data are visually organized using boxplots (groups of numerical sentiment scores in quartiles) in Figures 2-4 to map anomalies (signs of skewness), dispersion (outliers plotted as individual points and discarded from the data series), and the position of the median value (indicated by a bold line). The interquartile range (IQR = Q3 − Q1) represents the variability about the median; values below the lower bound, Q1 − 1.5 × IQR, or above the upper bound, Q3 + 1.5 × IQR, are treated as outliers. The whiskers connect the quartiles to the minimum and maximum values that remain. Where reviewers' opinions are heterogeneous, the IQR is displayed as a long box.
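The boxplot quantities described above can be computed as in the following Python sketch. It assumes the conventional 1.5 × IQR rule and the "inclusive" quantile method of Python's standard library; the paper does not say which quantile convention its plotting software used, and `boxplot_summary` is a hypothetical helper.

```python
from statistics import quantiles


def boxplot_summary(scores):
    """Quartiles, IQR, whisker ends and outliers for a list of sentiment scores."""
    q1, median, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    lower_bound = q1 - 1.5 * iqr
    upper_bound = q3 + 1.5 * iqr
    # Values outside the bounds are plotted as individual points (outliers).
    inliers = [s for s in scores if lower_bound <= s <= upper_bound]
    outliers = [s for s in scores if s < lower_bound or s > upper_bound]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        # Whiskers reach the smallest and largest values that are not outliers.
        "whiskers": (min(inliers), max(inliers)),
        "outliers": outliers,
    }
```

With discretized scores, many values coincide, so the box can collapse to a narrow band and a few dissenting reviews appear as outliers, which may explain the numerous outlier points reported for heavily reviewed squares.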

Traveller experience in selected Italian historic squares
It is presumed that online travel contributors, despite having selected heritage sites as travel destinations, are not necessarily specialized in architecture, cultural heritage, and urban design. In analysing TripAdvisor reviews, it is remarkable that they recognize and comment on aesthetic values of the visited places in detail, especially about HSs in Vatican City, Siena, and Bologna. Indeed, a relevant number of outliers of the graphs about (i) aesthetic perception (Figures 2-4)  The most frequent fluctuations about (iv) facilities and amenities occur in Piazza del Plebiscito in Naples, and about (v) sense of convenience and safety in Turin and Naples. While travellers feel unsafe in Piazza del Duomo in Florence, their perceptions about (iii) body mobility and comfort are positive, with an ascending trend in recent years.
In Piazza del Campo (Siena) the sentiment scores for (i), (iii), and (iv) follow the same trends over the years, except for (iv) in 2019-2020. Outliers are plotted in Milan, Venice, Florence, and to a lesser extent in Catania, against (iii) and (iv). These outliers depend on the high number of reviews (in Milan and Venice), or on the low homogeneity of the reviews (in Catania).
Among the large amount of information on sentiment fluctuations shown in Table 3, it should be underlined that traveller experience in Piazza San Marco (Venice) did not undergo relevant changes over 2013-2020, with high scores yet several outliers, except during the pandemic, as expected. The most variable trends concern (iv) (in Rome and Naples) and (v) (in Turin, Pisa, Rome, and Catania). Beyond the drastic reduction of risk perception, lower sentiment scores about vibrancy are detected especially in Milan and Venice.

Travellers' perceptions in Italian railway stations
Travellers usually comment about historic squares or monumental environments rather than about public infrastructures. Relevant asymmetries exist among the number of reviews about HSs and RSs per city (Table 1) and about the same type of public space in different cities, reflecting the long-standing difference of inbound tourism in Italy. Given the direct proportionality of the number of TripAdvisor reviews and the tourism trend, more data will hopefully be available in the near future, especially about travel experiences in southern regions. In fact, the curves for global trends of reviews are ascending in Naples from 2014 and in Venice from 2016, until the start of COVID-19 measures. The number of reviews is proportional to the number of arrivals per annum, apart from the Porta Nuova in Turin, which is the third busiest RS in Italy. This data analysis was discarded due to the lower number of reviews.
The low number of TripAdvisor reviews for the RSs and the brevity of textual contents therefore prevent comprehensive comparison with findings about HSs (Figure 5). The extracted data can be discussed only regarding the busiest railway stations: those in Rome, Milan, Florence, and Bologna.

Compared analysis of Italian HSs and RSs and factors influencing travellers' dissatisfaction
In Milan, Bologna, Florence, and Rome-Vatican City, travellers' perceptions of RSs over time are worse than their experiences of HSs. For all RSs, significant skewness can be found under almost all evaluation categories.
Contrary to how HSs are perceived considering (i) aesthetic perceptions, travellers fail to understand the relevant architectural value of the RSs, especially in Rome and Milan. In all cases, the railway station is viewed as a place to pass through, rather than a relevant exemplar of cultural infrastructure.
As a consequence of the upgrading/rehabilitation works in Stazione Centrale in Milan shortly before the World Expo in 2015, the sentiment scores have a rising trend under (i) aesthetic perception and (iv) facilities and amenities. In 2018-2019, while visitors' perceptions about (iv) stay almost constant, sentiment scores about (i) decrease. The sentiment scores about (v) sense of convenience and safety are the lowest compared to the other RSs, since this RS served as temporary refugees' shelter in 2015, continuing to be a gathering point for multi-ethnic communities, as discussed by Bini and Gambazza (2019).
Surprisingly, the evaluations against (i), which relate to enduring values of the built heritage (physical features and urban design quality), vary from year to year to a greater extent in RSs than in HSs. Even if aesthetic values are not directly related to mobility and safety patterns, i.e., (iii), (iv), and (v), the travellers may be influenced by negative perceptions of the whole ambience. The evaluations against (i), (ii), (iii), (iv), (v) in Stazione Termini (Rome) have values that are generally lower yet wide-ranging compared to Stazione Centrale (Milan). The evaluations for (iv) in Milan are the highest in 2015 as regards the RS and the lowest in Piazza Duomo in 2020.
Relevant year-to-year changes exist for all evaluation categories, with outliers especially in Rome (iv) and Milan (ii, iv) (Table 3). The greatest and most frequent skewness on (vi) occurs in Milan, Bologna, and Rome. In Bologna, the trend of sentiment scores for (vi) rises from 2017 to 2020, while in Rome, the boxplots display longer IQR, showing high disparities in evaluations. The low sentiment scores in RS can negatively influence the perception in the HS in Bologna and Florence since these public spaces are close to each other. Feeling unsafe in RS may influence visitors' perceptions when walking around the HS. A low homogeneity of visitors' perceptions about (v) is detected in this RS.
The global trends are also highly variable in the Stazione Termini, although an ascending trend occurs from 2016 to 2019. An abrupt decline in sentiment scores occurs for all evaluation categories from 2019 to 2020, except for (v).
The travellers' perceptions considering (vi) destination loyalty are shown in Figure 6. The greatest asymmetries of sentiment scores exist in Bologna and Rome-Vatican City. Heterogeneous evaluations are displayed in Rome, as shown in large IQRs and whiskers. The experience in HS and RS in Florence is evaluated highly in comparison with the tourism experiences in the other cities.
In addition to the compared assessment considering travellers' destination loyalty, the area of residence of TripAdvisor reviewers is indicated in Table 4.
TripAdvisor contributors of reviews about HSs and RSs in Rome, Milan, Florence, and Bologna mostly live in Europe (from the British Isles and Southern Europe), and in North America (US and Canada). The number of contributors living in Oceania (especially Australia) is also relevant. The survey findings discussed in this research did not reflect the perceptions of travellers from Central Asia, Russia, or sub-Saharan Africa.
Rome-Vatican City, Venice, Milan, and Florence are the leading municipalities in Italy by number of arrivals and overnights. The concentration of international tourist flows in a few destinations is a highly debated issue (Novy and Colomb 2019). Indeed, beyond the local disruption caused by mass tourism (lower quality of life for residents, infrastructure problems, loss of the identity of the place), some negative aspects are common to all the case studies.
As a result of this analysis, Table 5 shows wide-ranging yet context-based factors influencing travellers' dissatisfaction and indicates potential measures to improve travellers' future on-site experiences.

Research limitations and strengths
The main limitations of this research regard semantic issues, the scope of analysis, the sample size of the contributors, and questions regarding user profile.
First, the syntactical structure of the text is ignored in the text mining. Synonyms, polysemes, and context-dependent terms cannot be adequately scored (Lee, Song, and Kim 2010), e.g., 'green' may refer to marble surfaces or to flowerbeds; 'many people' can be intended as a positive or negative trait related to vibrancy or discomfort. Misspelled English or Italian terms (in this context, mostly referring to (i) sub. 1, (ii) sub. 9, and (iv) sub. 20) are not scored. Other sentences with ironic or implicit meanings also present difficulties in scoring. However, some of those semantic limitations have been partially overcome in this study by building ad hoc tourism-related lexica and a multistage manual validation.
Second, this method can be only efficiently applied when many UGCs are available. This study addresses the standpoints of English-speaking contributors. In most of the cases, they are one-day travellers familiar with social media. Daily usage patterns of residents and working communities are excluded in this analysis. The perspectives of domestic tourists (e.g., 49.5% in 2019) and non-English speakers (e.g., from France, 3.2%, in 2019, Eurostat datasets) are also missing. Moreover, demographic characteristics of contributors cannot be disclosed due to TripAdvisor's profile privacy and personal security statement.
Leaving aside these limitations, sentiment analysis can be used in urban design as one of the early stages of gathering data on ideas, experiences (individual activity patterns), and visions for mapping multifaceted cultural landscapes and, ultimately, to inform the design process. This remote survey is not very time-consuming, and it is reliable at a finer scale in a longer time frame. The data are freely accessible and larger compared to on-site surveys and other qualitative analysis, such as face-to-face questionnaires or workshops (Wang 2016, 3). In preserving the cultural value of the visited places, their competitiveness as destinations can be improved if the causes of travellers' dissatisfaction indicated in UGCs are further analysed in public heritage and city development strategies.

Conclusions
Exterior public spaces in heritage cities are gathering places, venues for socio-cultural events, and privileged travel destinations. Travellers are not simply temporary consumer-viewers, but active stakeholders. Their thoughts, concerns, expectations, and emotions should be mapped and discussed to improve public and effective participation in urban planning. As stated in the literature (Münster et al. 2017; Kovacs-Gyori et al. 2018), digitally mediated processes can overcome the limitations of on-site surveys or other traditional data collection methods. This optimizes resource-effort, reaches different groups of stakeholders, and extends the analysed time frame, collecting a massive amount of real-time data with semi-automatic grouping.
Since virtual communities feel more comfortable sharing their private information on social media networks (Wang 2016), patterns of travellers' perceptions can be grasped in near-real time. When integrated with other collection methods able to include residents' perspectives and grounded on a comprehensive knowledge of the characteristics of the site under analysis, the use of user-generated contents (UGCs) can contribute to improving city management strategies by addressing shortcomings and decreasing asymmetries in tourism distribution.
Such a type of remote advanced analysis allows us to scrutinize travellers' cultural needs and expectations and map the ever-evolving relationship between valued places and their users in specific frames, for example during specific cultural or political events. As stated by Evans (2001, 7), 'Planning infers the planning of resources, present and future, and therefore cultural planning concerns activities, facilities and amenities that make up a society's cultural resources'.
This study has gathered research evidence on real-time on-site experiences, extending the scope of previous tourism and leisure studies based on text mining of UGCs. This research shows how English-speaking travellers perceive multi-dimensional and multiscalar features of exterior public spaces in 10 Italian heritage cities, selected for their cultural value and because they serve as gathering places for multiple communities. Although grounded in specific locations, this research shows the effectiveness of a method also applicable to other cultural realms.
The rich information of 43,000 online TripAdvisor reviews has been analysed by year considering place quality and connectivity. Strengths and asymmetries in the quality and amount of tourism flow are underlined. The post-COVID outlook of the travellers' perceptions reveals relevant fluctuations, especially about mobility and safety patterns. Besides the relevance of these results in the COVID-19 scenario, action-oriented research using UGCs can foster more sustainable strategic tourism planning over time.
The analysis of semantic data from social media platforms contributes to responding to the need to democratize the decision-making process, a central issue in spatial planning. Such regrouped information can be combined with GIS (geographic information system) data and graphically represented in local constraints and possibilities maps, diagrams (Moughtin et al. 1999), or other thematic maps. Ultimately, specific mitigation measures can be drawn to improve security, aesthetic concerns, or mobility patterns by leveraging social media maps, supplemented with official information (e.g., traffic data or police reports) or framed within forward regulatory action plans, including national and local zoning regulations or strategic plans for the development of tourism. The validation phase of urban strategic plans using social databases can address situations in which the design project may not be fully synchronized with emerging needs or local constraints.

Disclosure statement
No potential conflict of interest was reported by the author(s).