CAPRI: Context-Aware Interpretable Point-of-Interest Recommendation Framework

Point-of-Interest (POI) recommendation systems have gained popularity for their ability to suggest geographical destinations by incorporating contextual information such as time, location, and user-item interaction. Existing recommendation frameworks lack the contextual fusion required for POI systems. This paper presents CAPRI, a novel POI recommendation framework that integrates context-aware models, such as GeoSoCa, LORE, and USG, and introduces a novel strategy for efficiently merging contextual information. CAPRI also integrates an evaluation module that extends the evaluation scope beyond accuracy to include novelty, personalization, diversity, and fairness. With the aim of establishing a new industry standard for reproducible results in POI recommendation systems, we have made CAPRI openly accessible on GitHub, facilitating contributions to the continued development and refinement of the framework.


INTRODUCTION
To select an appropriate vacation destination, restaurant, or other place to visit, collectively known as Points-of-Interest (POIs), users must choose from a wide variety of options. POI recommender systems can help overcome the resulting information overload in many use cases. For instance, recommending hotels and other travel-related destinations remains challenging because trip planning involves a set of interconnected factors (e.g., means of transportation, housing, and attractions) and because contextual factors (e.g., time, location, and social environment) may have a significant impact.
Recent years have seen the development of numerous frameworks, libraries, and tools for Recommender Systems (RSs) that make it easy for researchers to simulate the recommendation process and its influence on user preference. Using such frameworks standardizes algorithm implementations and facilitates the reproducibility of experiments. Despite these advances, reproducibility remains a challenge in RS research, particularly in areas that are not well established, such as fairness-aware and domain-specific recommendation [19]. Even minor differences in parameters and experimental settings can yield inconsistent results, making it difficult to provide definitive answers about the relative properties of different algorithms. Hence, reproducible evaluation and fair comparison of methods are pressing requirements for RSs.
Despite the progress made in the field, existing frameworks for RS reproducibility are typically intended to simulate generic Collaborative Filtering environments. For instance, Cornac [17] includes models that leverage auxiliary data such as item descriptive text and images. RecBole [30], an alternative comprehensive framework, supports general, sequential, and knowledge-based recommendation. Likewise, Elliot [2] is another framework that covers a wide range of general-purpose models.
However, POI recommendation has particularities that set it apart from recommendation in other domains:
• The importance of context integration and fusion: Users' check-ins in POI recommendation are considerably affected by contextual information. For instance, the geographical properties of a location affect user mobility patterns, and users' visits are time-dependent, which indicates the importance of temporal information [11]. Other types of context may include social ties, POI categories, comments on POIs, etc. Previous research such as [12] has shown that incorporating this rich contextual information has a significant impact on the performance of POI recommendation models. Therefore, recent years have seen growing demand for specialized recommendation algorithms and methodologies that can incorporate and fuse contextual information into the POI recommendation process [24];
• High sparsity: The characteristics of check-in datasets in the POI recommendation domain differ significantly from those of other recommendation domains [3,8]. The density of POI check-in data is typically around 0.1%, whereas the density of Netflix data for movie recommendation is 1.2%. This is because a person can visit only a limited number of locations, whereas a city can contain a vast number of POIs;
• Necessity for multi-dimensional evaluation: Previous papers [8,18,23] in the POI field predominantly focus on accuracy-oriented metrics. However, there is broad consensus in the RS community that accuracy metrics alone cannot capture other important facets of the recommendation process, such as the novelty, diversity, and catalog coverage of recommenders. Therefore, we aim to standardize multi-faceted evaluation along the accuracy, beyond-accuracy, and fairness dimensions [22].
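To make the sparsity figures above concrete, the following minimal sketch computes the density of a check-in dataset. It assumes that density means the fraction of user-POI pairs with at least one observed check-in; the function name and the toy data are illustrative and are not part of CAPRI.

```python
def checkin_density(checkins, n_users, n_pois):
    """Fraction of (user, POI) pairs with at least one observed check-in.

    `checkins` is an iterable of (user_id, poi_id) tuples (assumed format).
    """
    observed_pairs = set(checkins)  # repeated visits to the same POI count once
    return len(observed_pairs) / (n_users * n_pois)


# Toy data: 4 users, 5 POIs, 3 distinct user-POI pairs (one visit is repeated).
toy_checkins = [(0, 1), (0, 1), (2, 3), (3, 4)]
toy_density = checkin_density(toy_checkins, n_users=4, n_pois=5)
print(f"density = {toy_density:.2%}")
```

On real check-in data the denominator grows with the product of users and POIs, while each user contributes only a handful of pairs, which is why the reported densities are so low.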
Contributions. The work at hand addresses the above shortcomings by proposing CAPRI, a specialized framework for evaluating and benchmarking state-of-the-art POI recommendation models.
Unlike existing open-source frameworks such as DaisyRec [21], Elliot [2], LensKit [6], LibRec [7], LibRec-auto [20], OpenRec [25], and CaseRec [4], which mainly aim to reproduce traditional recommender systems, deep learning-based frameworks such as DeepRec [29], and multimodal frameworks like Cornac [17], CAPRI is intended to provide context-aware recommendation and evaluation in the POI domain. We have equipped our framework with state-of-the-art models and algorithms, well-known datasets for POI recommendation, and multi-dimensional evaluation criteria (accuracy, beyond-accuracy, and user-item fairness). It also supports the reproducibility of results through adjustable configurations for comparative experiments.
To the best of our knowledge, despite recent advances, there is no publicly accessible framework for reproducing models in the field of context-aware POI recommendation.

RELATED FRAMEWORKS
In recent years, the introduction and implementation of RS frameworks and libraries has gained considerable attention. Sonboli et al. [20] proposed a recommendation framework titled LibRec-auto for automating various aspects of offline batch RS experiments. The framework covers a wide range of recommendation and re-ranking algorithms, along with various evaluation and fairness-aware metrics. Another framework, introduced by Zhao et al. [31], covers a wide range of RS applications and contains 73 models and 28 datasets. Their framework, titled RecBole, is implemented in PyTorch and focuses on execution performance, along with covering evaluation in the RS domain. Sun et al. [21] introduced a Python-based toolkit named DaisyRec as a benchmark for rigorous evaluation in recommendation. Their toolkit is equipped with seven well-tuned state-of-the-art algorithms and six widely used datasets. In contrast with other existing open-source libraries, DaisyRec aims to rigorously evaluate recommendation performance. Similar to CAPRI, Werneck et al. [24] introduce a framework for the reproducibility of POI recommendation experiments. However, their approach is not exhaustive and is not easily replicable, as it only generates the outcomes of their earlier work [23].
The majority of current frameworks are general-purpose and do not prioritize domain-specific requirements, such as context awareness. This makes it challenging to repurpose them for domain-specific work. In contrast to the frameworks introduced above, CAPRI focuses on the POI domain and aims to provide researchers with all the necessary resources.

PROPOSED FRAMEWORK
CAPRI is an open-source recommendation framework implemented in Python, suitable for practical experimentation and reproducibility studies. The framework is distributed under the GPL v3 license and can be downloaded or cloned from GitHub. Figure 1 illustrates the general workflow of CAPRI.

File Structure
In terms of implementation, the files of the framework are organized into several directories to facilitate accessibility and extensibility. CAPRI uses PascalCase for folder names and camelCase for file names, merging words into a single string in both cases. The directories of the framework are briefly described below:
• Data: Contains data-driven files and functions of various types. Each dataset includes files with the .txt extension that contain train, test, and tune data. Moreover, other files containing the check-in data and relations among users/items, such as social and geographical data, are stored in folders with the same name as each dataset. The Data directory also contains data processing functions, including readDataSizes.py to read the metadata of a dataset, loadDatasetFiles.py to load the items of a selected dataset, and calculateActiveUsers.py to calculate the active/inactive users of a selected dataset for fairness-aware analysis. The current datasets of CAPRI are discussed in Section 3.2.
• Models: Contains the models used in the framework, along with several common functions in the utils.py file that avoid code duplication and increase the reusability of model files. For each model, there is a folder with the same name, a main.py that controls the overall processing, and various functions that process a selected dataset according to the selected model. The models available in CAPRI are discussed in more depth in Section 3.3.
• Evaluations: Contains all evaluation metrics available for analyzing the performance of models on datasets, the evaluator module evaluator.py that leverages the metrics, and a unit test file test.py for checking the behavior of each measure with different input types. The evaluation metrics supplied in CAPRI and the evaluation process are discussed in detail in Sections 3.4 and 4, respectively.
• Outputs: CAPRI stores the final results here, including ranked lists and evaluation outputs, for reproducibility purposes. The file naming structure prevents a previously performed analysis from being reprocessed. Note that, due to the size of the ranked list files, we do not save them on GitHub.
We welcome academics and developers who aim to contribute to the framework's enhancement. Documentation on how to contribute to the project is available on the readthedocs page of the framework.

Datasets
In the current version of the framework, we provide modified versions of three popular check-in datasets: Gowalla, Yelp [8], and Foursquare. The characteristics of these datasets are presented in Table 1.

Models
CAPRI covers recent implementations of various models, which can be applied to the introduced datasets for evaluation and reproducibility purposes. The models implemented in the framework are listed below:
• GeoSoCa: Introduced by Zhang and Chow [27], this model covers geographical, social, and categorical correlations among users and POIs. These contexts are learned from users' historical check-in data to produce relevance scores for unseen locations.
• LORE: Introduced by Zhang et al. [28], LORE is a popular and robust model for location recommendation that focuses on the impact of geographical and social influence on users' check-in behaviors.
• USG: Introduced by Ye et al. [26], USG takes geographical influence, social network, and user interest into account for POI recommendation.
The current CAPRI version covers standard competitive contextual models for the POI domain, and users have the flexibility to modify contexts according to their requirements. Our future plans include the incorporation of deep learning, graph-based, sequential, and session-based models as proposed in works like [1,9]. These models can integrate various contextual components, such as geographical, temporal, social, and categorical relevance scores, using fusion rules such as product or sum [8,12], forming a unified preference score [10,13,16]. Contextual information, denoted by $C_i$, can be fused using a polynomial regression model

$$S_{u,p} = \Lambda \cdot C + \Lambda_{pair} \cdot C_{pair} + \lambda_{123} C_1 C_2 C_3 \qquad (1)$$

where $S_{u,p}$ is the unified preference score of user $u$ for POI $p$, $\Lambda = [\lambda_1, \lambda_2, \lambda_3]$ and $C = [C_1, C_2, C_3]^T$ collect the individual context weights and scores, $\Lambda_{pair} = [\lambda_{12}, \lambda_{13}, \lambda_{23}]$ and $C_{pair} = [C_1 C_2, C_1 C_3, C_2 C_3]^T$ collect the pairwise terms, and $\lambda_i$ indicates the importance weight for the context $C_i$ learned by the model. Note that the product rule would have $\lambda_i = 0$ for all $i$ and $\lambda_{123} = 1$. In the case of the sum rule, $\lambda_1$, $\lambda_2$, and $\lambda_3$ are 1 and the rest are 0, while in the weighted sum rule, optimal values are assigned to them to maximize performance criteria.
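The three fusion rules described above can be sketched as special cases of one polynomial fusion function. This is a minimal illustration under stated assumptions, not CAPRI's actual implementation: the function name `fuse_contexts`, the ordering of the pairwise terms (1-2, 1-3, 2-3), and the toy scores and weights are all hypothetical.

```python
from itertools import combinations

import numpy as np


def fuse_contexts(scores, single_w, pair_w, triple_w):
    """Polynomial fusion of three context score vectors.

    scores:   three per-POI score vectors, shape (3, n_pois)
    single_w: weights of the individual contexts (length 3)
    pair_w:   weights of the pairwise products, assumed order (1,2), (1,3), (2,3)
    triple_w: weight of the product of all three contexts
    """
    c = np.asarray(scores, dtype=float)
    pairs = np.array([c[i] * c[j] for i, j in combinations(range(3), 2)])
    return (np.asarray(single_w, dtype=float) @ c
            + np.asarray(pair_w, dtype=float) @ pairs
            + triple_w * np.prod(c, axis=0))


# Toy geographical, social, and categorical scores for two candidate POIs.
toy_scores = [[0.2, 0.5], [0.4, 0.1], [0.5, 1.0]]

# Product rule: only the triple term is active.
product_fused = fuse_contexts(toy_scores, [0, 0, 0], [0, 0, 0], 1.0)
# Sum rule: the individual weights are 1 and all other weights are 0.
sum_fused = fuse_contexts(toy_scores, [1, 1, 1], [0, 0, 0], 0.0)
# Weighted sum: illustrative weights; in practice they would be tuned.
weighted_fused = fuse_contexts(toy_scores, [0.5, 0.3, 0.2], [0, 0, 0], 0.0)
```

Expressing all rules through one function makes it easy to compare them on the same context scores, since only the weight vectors change between runs.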

Evaluation Dimensions
CAPRI supports a range of evaluation metrics, which can be classified into the following categories:
• Accuracy: The framework covers the Precision@k, Recall@k, mAP@k, and nDCG@k metrics, in which k represents the number of top-ranked items considered for recommendation.
• Beyond-Accuracy: This category contains the List Diversity, Novelty, Catalog Coverage, and Personalization metrics.
• Fairness: This category contains modules for grouping users and items according to a sensitive attribute, and includes MADr and GCE evaluation among the user/item groups [5,15,22].
The offered metrics are tailored to the requirements of POI recommendation. All the evaluation metrics can be accessed in the Evaluations directory of the framework. The same folder also contains a test.py file for verifying each metric through unit testing.
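As an illustration of the accuracy and beyond-accuracy categories above, the sketch below implements textbook versions of Precision@k, Recall@k, nDCG@k (with binary relevance), and Catalog Coverage. The function names and signatures are assumptions made for this example; the definitions used in CAPRI's Evaluations directory may differ in detail.

```python
import math


def precision_at_k(ranked, relevant, k):
    """Share of the top-k recommended POIs that the user actually visited."""
    return len(set(ranked[:k]) & set(relevant)) / k


def recall_at_k(ranked, relevant, k):
    """Share of the user's visited POIs that appear in the top-k list."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance nDCG: hits are discounted by log2 of their rank."""
    relevant = set(relevant)
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, poi in enumerate(ranked[:k]) if poi in relevant)
    ideal = sum(1.0 / math.log2(rank + 2)
                for rank in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0


def catalog_coverage(ranked_lists, n_pois, k):
    """Share of the POI catalog that appears in at least one top-k list."""
    recommended = {poi for ranked in ranked_lists for poi in ranked[:k]}
    return len(recommended) / n_pois


# Toy example: one user's ranked list and held-out visits.
toy_ranked = [3, 1, 7, 5]
toy_relevant = {1, 5, 9}
toy_precision = precision_at_k(toy_ranked, toy_relevant, k=3)
toy_recall = recall_at_k(toy_ranked, toy_relevant, k=3)
toy_ndcg = ndcg_at_k(toy_ranked, toy_relevant, k=3)
toy_coverage = catalog_coverage([toy_ranked, [1, 2, 3, 4]], n_pois=10, k=3)
```

Accuracy metrics are computed per user and then averaged, whereas Catalog Coverage is computed once over all users' lists, which is why it takes the full collection of ranked lists as input.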

Configuration
CAPRI contains a config.py file that provides the settings for running various experiments. The parameters that can be set in this file are listed below:
• dataDirectory: the path from which the dataset files are read,
• outputsDir: the path in which the final recommendation lists generated by the framework are stored,
• topK: the number of top-k items used in the evaluations (default: 10),
• limitUsers: a limit on the number of users in the dataset (default: -1),
• listLimit: a limit on the length of the final recommendation lists (default: 10),
• activeUsersPercentage: the percentage of users used to form the pre-defined group known as "active users,"
• models: the available models to be selected by the user,
• datasets: the available datasets to be selected by the user,
• fusions: the available fusions to be selected by the user,
• evaluationMetrics: the available evaluation metrics to be selected by the user.
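A configuration with the parameters listed above might look like the following sketch. The parameter names and the three stated defaults come from the list; everything else is an assumption made for illustration, including the dictionary layout, the paths, the activeUsersPercentage value, the spelling of the option names, and the hypothetical helper `validate_selection`. The actual config.py may be organized differently.

```python
config = {
    "dataDirectory": "./Data/",        # assumed path
    "outputsDir": "./Outputs/",        # assumed path
    "topK": 10,                        # default stated in the text
    "limitUsers": -1,                  # default stated in the text
    "listLimit": 10,                   # default stated in the text
    "activeUsersPercentage": 20,       # assumed value, for illustration only
    "models": ["GeoSoCa", "LORE", "USG"],
    "datasets": ["Gowalla", "Yelp", "Foursquare"],
    "fusions": ["Product", "Sum", "WeightedSum"],          # assumed spellings
    "evaluationMetrics": ["Precision", "Recall", "mAP", "nDCG"],
}


def validate_selection(cfg, model, dataset, fusion):
    """Hypothetical helper: reject options that are not listed in the config."""
    for key, value in (("models", model),
                       ("datasets", dataset),
                       ("fusions", fusion)):
        if value not in cfg[key]:
            raise ValueError(f"{value!r} is not one of the available {key}")
    return {"model": model, "dataset": dataset, "fusion": fusion}


selection = validate_selection(config, "GeoSoCa", "Yelp", "Sum")
```

Keeping the available options in one place lets an experiment runner validate a requested model, dataset, and fusion combination before any expensive computation starts.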

EVALUATION PROCESS
This section describes how the evaluation process takes place in CAPRI. All the evaluation-related functions of the framework are collected in the evaluator.py file in the Evaluations directory. The model and the dataset selected by the user, as well as the evaluation and model parameters, are passed to the evaluator. Model parameters contain the model-specific final scores, such as the geographical, social, and categorical scores for GeoSoCa. Evaluation parameters are passed as a dictionary that contains evaluation-related inputs, such as the list of users, the list of POIs, and the ground-truth data. By iterating over the users in the dataset, the overall recommendation scores and the requested accuracy measures, such as Precision@k and Recall@k, are calculated. The final results are saved in files for later processing.
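The evaluation loop described above can be sketched as follows. This is a simplified illustration, not the code in evaluator.py: the function name `evaluate`, the dictionary keys ("users", "pois", "groundTruth"), the layout of the model parameters (one users-by-POIs score matrix per context), and the default product fusion are assumptions.

```python
import numpy as np


def evaluate(model_params, eval_params, top_k=10, fuse=np.prod):
    """Fuse the per-context scores of each user, rank POIs, and average metrics.

    model_params: dict mapping a context name to a (n_users, n_pois) score matrix
    eval_params:  dict with the list of users, the list of POIs, and ground truth
    fuse:         reduction applied across contexts (np.prod or np.sum)
    """
    users = eval_params["users"]
    pois = eval_params["pois"]
    ground_truth = eval_params["groundTruth"]
    precisions, recalls = [], []
    for row, user in enumerate(users):
        relevant = ground_truth.get(user, set())
        if not relevant:
            continue  # users without held-out check-ins cannot be scored
        context_scores = np.stack([m[row] for m in model_params.values()])
        overall = fuse(context_scores, axis=0)
        top_idx = np.argsort(-overall, kind="stable")[:top_k]
        recommended = [pois[i] for i in top_idx]
        hits = len(relevant.intersection(recommended))
        precisions.append(hits / top_k)
        recalls.append(hits / len(relevant))
    return {
        f"Precision@{top_k}": float(np.mean(precisions)) if precisions else 0.0,
        f"Recall@{top_k}": float(np.mean(recalls)) if recalls else 0.0,
    }


# Toy run: two users, three POIs, two contexts.
toy_model_params = {
    "geographical": np.array([[0.9, 0.1, 0.5], [0.2, 0.8, 0.4]]),
    "social": np.array([[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]),
}
toy_eval_params = {
    "users": ["u1", "u2"],
    "pois": ["a", "b", "c"],
    "groundTruth": {"u1": {"a"}, "u2": {"a", "b"}},
}
toy_results = evaluate(toy_model_params, toy_eval_params, top_k=2)
```

Passing the fusion as a callable keeps the loop independent of the chosen rule, so the same evaluator can be reused when comparing product and sum fusion.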

BENCHMARKING
Table 2 shows initial experimental results obtained with our proposed framework, CAPRI. Using CAPRI, we are able to incorporate different contextual models of the POI recommendation domain as well as different approaches to context fusion. The fusion methods have a considerable impact on the performance of POI models. The sum rule yields much better beyond-accuracy performance (i.e., coverage, novelty, and diversity) than the product rule. Additionally, the sum rule often outperforms the product rule in accuracy for the GeoSoCa model, whereas the product rule often outperforms the sum rule in accuracy for LORE (with one exception, where the sum is better than the product).
Finally, the weighted sum has a favorable effect on the GeoSoCa model, making it the most accurate model, and improves on the sum rule for LORE. Further evaluation of user/item fairness, from a work in which CAPRI has been employed for POI recommendation, can be found in [14].

DISCUSSION AND CONCLUSION
CAPRI is an open-source framework for POI recommendation that distinguishes itself by leveraging contextual information in the recommendation pipeline. It incorporates robust models, datasets, and evaluation metrics to provide location suggestions that match the context. Like any software, CAPRI requires continuous development to accommodate user needs. We plan to widen its coverage of datasets and models, incorporate more evaluation metrics, and investigate parallelization and request queuing to improve performance. Future iterations will support batch request handling with user-defined parameters and offer a GUI for easier configuration. We also aim to make the framework directly installable via Python's pip and to integrate bias mitigation approaches to reduce system biases. The framework is publicly available on GitHub for researchers.

Figure 1: General workflow of the experiments handled by CAPRI.

Table 1: Characteristics of the datasets available in CAPRI. Columns: Dataset, #users, #POIs, #check-ins, #social, #category.

Because CAPRI is open source and accepts contributions, it is straightforward to add new models, datasets, and metrics.

Table 2: Accuracy and beyond-accuracy performance of models and baselines evaluated on top-20 recommendation lists on the Yelp dataset. Under context, g, t, s, and c denote the geographical, temporal, social, and categorical contexts, respectively. The best metric values are shown in bold.