CLUSTER AND CONJOINT ANALYSIS FOR DETERMINING ONLINE SHOP SHOPEE CUSTOMERS PREFERENCE BASED ON E-SERVICE QUALITY

Shopping through e-commerce is a trend in high demand among Indonesians, which makes e-service quality very important in a transaction. Customer preference is one of the main factors in shaping a business strategy. This research uses cluster analysis with the k-means algorithm to segment Shopee's e-commerce customers based on sociodemographic characteristics, and conjoint analysis to determine which e-service quality attributes are most important to each cluster. The sociodemographic characteristics analyzed are gender, education, profession, e-commerce visits, income, and last purchase from the e-commerce platform. The k-means algorithm yields 2 groupings of customers based on their sociodemographic characteristics: cluster 1, whose members are mostly women with frequent visits, and cluster 2, whose members are mostly men with frequent visits. Building on the cluster analysis, conjoint analysis identifies which e-service quality attributes are most important to each cluster. Members of cluster 1 prioritize the full payment method when shopping online, while members of cluster 2 prioritize the star seller shop type. The least important aspect when shopping online is fulfillment in cluster 1 and security in cluster 2.


INTRODUCTION
Rapid technological advances, especially the internet, keep people active in public and private activities, so that formal communication can always be supported by the internet [1]. Based on the 2019-Q2 2020 APJII internet user survey report, the penetration of internet users in Indonesia has increased by 6.8%, to a total of 196.72 million people [2]. The internet is one way to reach a very broad network of relationships without being limited by place and time. This makes the internet popular for business communication.
Online shopping has become a booming electronic business concept in the past few years, along with the development of internet technology [3]. The ease of shopping online using a computer or smartphone makes it popular with many Indonesians. With existing devices and internet networks, and without the cost of leaving the house and crowding into stores, people have made online shopping a trend.
Ong and Yap explained that consumers in Indonesia are influenced by social interactions before making purchase decisions [4]. The process allows potential buyers to visit different stores to get the best deal. Certain individuals have a very strong influence on others in this process. Shopee is an application-based e-commerce platform and one of the largest platforms in Indonesia. Shopee allows anyone to register products for sale or to shop with many attractive offers. According to APJII, Shopee ranks second among e-commerce platforms, with 27.4% of all e-commerce users [2].
Service quality is one of the most important components of a business strategy. Quality of service can influence customer judgment, trust, and commitment, which makes it important to long-term business success [5]. According to Stastika, e-service quality at Shopee is rated as very important [6]. Shopee management must therefore carry out continuous development and innovation to maintain the quality of service on this application.
Customer preference is one of the aspects of purchasing decisions [7]. Factors that influence customer preferences need to be identified further so that e-service can be maximized on these factors in the future. Conjoint analysis is one of the most commonly used techniques for examining customer preferences. Conjoint analysis can identify the relative importance of the attributes that shape customer preferences [8]. In this research, customer preferences are reviewed based on alternatives for the e-service quality variable. Because customer preferences can vary according to the characteristics of each customer, it is necessary to segment customers based on the results of the conjoint analysis.
Segmentation is needed by companies that target their offerings according to customer needs and responses in order to maximize profit [9]. In this research, customer segments are formed based on customer sociodemographic characteristics, which are processed using cluster analysis. Cluster analysis is a method that can be used to segment customers [10]. Cluster analysis shows significant potential for predicting customer behavior and differentiating the characteristics of each consumer. One of the algorithms popularly used in the business world is the k-means algorithm. The k-means algorithm divides the data into a number of clusters to analyze the similarities and differences in the data, and then analyzes the pattern of connections between the data.
The objective of this research is to use conjoint analysis to determine the preferences of Shopee online shop customers in West Java for e-service quality attributes, and to segment them based on customer preferences and sociodemographic characteristics using the k-means algorithm. The sociodemographic characteristics used in this research are gender, education, profession, e-commerce visits, income, and last purchase from the e-commerce platform.

RESEARCH METHODS
In this study, the steps applied to obtain results that meet the research aim are:
1. Review conjoint analysis and k-means cluster analysis.
2. Determine the stimulus cards for conjoint analysis using the full profile image.
3. Describe the sociodemographic characteristics of customers.
4. Form customer segments based on customer characteristics, in 2 clusters.
5. Determine the relationship between customer sociodemographic characteristics and the clusters using chi-square analysis, in order to name the clusters.
6. Determine the customer preferences of each cluster formed using conjoint analysis.
For data processing, this study uses SPSS 25.0 software so that the analyzed data can be more accurate.

Research Approach
This research is quantitative with a descriptive presentation, so it does not test relationships between variables. Descriptive methods are used to examine the characteristics of variables that are of interest in certain situations [11]. Descriptive research is often carried out to describe the characteristics of variables of interest in a group of people, for example gender, age, education level, occupation, and others.

Data Collecting Procedures
In this study, the data used are primary data and secondary data. Primary data were obtained by distributing questionnaires to all Shopee e-commerce customers in West Java province. There were 83 respondents prepared to fill out the questionnaire, and the survey was conducted from December 2020 to January 2021. The sampling technique used in this study was purposive sampling with a quota sampling technique, that is, sampling based on certain considerations that are generally adjusted to the research objectives [12]. The sample size was determined using a research power (1-β) of 95%.

Conjoint Analysis
Conjoint analysis is a multivariate analysis method in which stimuli are given to the respondent, who provides an overall assessment in the form of a rating [13]. Conjoint analysis aims to determine a person's perception of an object. The result of conjoint analysis is the respondent's preference for an object, be it a product or a service.
The conjoint analysis process begins with problem formulation, namely identifying all attributes and their levels in order to create the stimuli. In general, a conjoint analysis has a maximum of seven attributes, and each attribute has two to four levels. The attributes and levels should be actionable or realizable, and written in language that the respondent can easily understand. The approaches commonly used to construct the stimuli are the full profile, or multi-factor evaluation, and the paired combination, or two-factor evaluation [13]. After the stimuli are formed, the next step is to collect a rating of each stimulus based on the respondent's preferences. From the stimulus ratings, an order from the most preferred to the least preferred is obtained, so that the respondent's most preferred stimulus can be found. In addition to the most preferred stimulus, the most preferred attributes are also obtained, so the calculation results show which attributes the respondent likes the most. The basic model for calculating conjoint analysis is as follows [14]:

$$U(X) = \sum_{i=1}^{m} \sum_{j=1}^{k_i} \alpha_{ij} x_{ij}$$

with:
$U(X)$ = total utility
$\alpha_{ij}$ = utility value of attribute $i$ ($i = 1, 2, 3, \dots, m$) at level $j$ ($j = 1, 2, 3, \dots, k_i$)
$k_i$ = number of levels of attribute $i$
$m$ = number of attributes
$x_{ij}$ = dummy variable for attribute $i$ at level $j$ (1 = the level appears; 0 = the level does not appear)
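As a minimal sketch of the additive utility model described above, the following Python snippet sums the utility values of the levels that appear on one stimulus card. The attribute and level names follow the alternatives mentioned in this paper, but the numeric utility values and the function name `total_utility` are hypothetical and chosen only for illustration; they are not results of this study.

```python
# Hypothetical utility values (alpha_ij); illustrative numbers only, not study results.
part_worths = {
    "fulfillment": {"COD": 0.10, "delivery service": -0.10},
    "shop type": {"star seller": 0.25, "regular store": -0.25},
    "voucher": {"cashback": 0.05, "free delivery": -0.05},
}


def total_utility(profile, part_worths):
    """Total utility: sum over attributes i and levels j of alpha_ij * x_ij."""
    utility = 0.0
    for attribute, levels in part_worths.items():
        for level, alpha in levels.items():
            # Dummy variable x_ij: 1 if this level appears on the card, else 0.
            x = 1 if profile[attribute] == level else 0
            utility += alpha * x
    return utility


# One stimulus card: exactly one level per attribute.
card = {"fulfillment": "COD", "shop type": "star seller", "voucher": "free delivery"}
u = total_utility(card, part_worths)  # 0.10 + 0.25 - 0.05
```

Because exactly one level per attribute appears on a card, only one dummy variable per attribute equals 1, so the total utility is the sum of one utility value per attribute.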

Cluster K-Means Analysis
K-Means analysis is a non-hierarchical clustering method that groups data into one or more clusters so that members of the same cluster have similar characteristics. K-Means is distance-based: it divides data into several clusters according to the similarity of the characteristics of each data point. K-Means can only be used on numeric data [15].
K-Means analysis divides the data into two or more clusters based on the Euclidean distance: the higher the Euclidean distance, the greater the difference in characteristics between the compared data points. The Euclidean distance between two data points $x$ and $y$ measured on $n$ variables is:

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$

Each data point is compared with the others so that the Euclidean distance is known for every pair. Once these distances are known, the data can be divided into two or more clusters. Data points separated by a small Euclidean distance are likely to be placed in the same cluster. Likewise, data points separated by a large Euclidean distance are likely to fall into different clusters.
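The distance-based grouping described above can be sketched in a few lines of Python. This is an illustrative implementation of the standard k-means loop (assign each point to the nearest centroid, then recompute the centroids), not the SPSS 25.0 procedure used in this study; the data points, the starting centroids, and the function names `euclidean` and `k_means` are assumptions made for the example.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two numeric data points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then update centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda c: euclidean(p, centroids[c]))
            clusters[nearest].append(p)
        # New centroid = mean of the members; keep the old one if a cluster is empty.
        centroids = [
            tuple(sum(v) / len(cluster) for v in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical two-variable data (not the survey data).
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

In this toy data the two points near (1, 1.5) have a small mutual distance and end up in one cluster, while the two points near (8.5, 8) form the other, which mirrors the rule that small distances indicate the same cluster and large distances indicate different clusters.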

Stimulus Card Arrangement with Full Profile Image
The stimulus cards are built from combinations of Shopee's e-commerce attributes, which consist of 6 attributes: fulfillment, efficiency, security, shop type, payment, and voucher. Customers can choose between 2 alternatives for each attribute. Table 1 contains the alternatives for each of Shopee's e-commerce e-service quality attributes, which are used as the basis for forming the stimulus cards in this study.
Stimulus cards are formed by combining the alternatives of each attribute with the alternatives of the other attributes, using the full profile image technique. If the attributes and attribute levels under study are few, the respondent can evaluate every combination of stimuli. This approach is called factorial, and all combinations can be used. However, as the number of attributes and levels grows, this method becomes impractical. With the complete combination method, the 6 attributes with 2 alternatives each yield a total of 64 stimuli. With such a large number of combinations, it would be difficult for customers to make an assessment. Therefore, an orthogonal contrast design is used to reduce the number of combinations [16]. The minimum number of stimuli that the respondent must evaluate depends on the number of attributes and their levels.
The alternatives of each attribute for the stimulus cards were determined using SPSS 25.0 software, which produced 8 stimulus cards to be assessed by the respondents. The stimuli reduced by the orthogonal contrast design are shown in Table 2, which lists the stimulus cards for Shopee's e-commerce attributes. Respondents who filled out the questionnaire were asked to rate the attribute arrangement of each stimulus card on a scale from 1 to 10.
The ratings given by the respondents were then processed and analyzed using conjoint analysis with the help of SPSS 25.0 software to obtain the utility values that indicate each respondent's preference for each alternative of each attribute.
The utility values obtained from the conjoint analysis, together with the respondents' answers, are used as grouping variables in the cluster analysis. The type of cluster analysis used in this study is the k-means algorithm, and the data are processed and analyzed with the help of SPSS 25.0. After the segments are obtained, the sociodemographic characteristics of each segment, as well as its attribute and alternative utility values, are identified in order to name the clusters formed and to describe the preferences of each cluster.
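The reduction from the full factorial set to a small orthogonal set of cards can be illustrated with the following Python sketch. It enumerates all 64 full-profile combinations of 6 two-level attributes and then builds one possible 8-card orthogonal main-effects plan. The construction used here (three base columns plus their pairwise XOR) is an assumption for illustration; the plan generated by SPSS 25.0 in this study may differ. The minimum-card rule in the code (total number of levels minus number of attributes plus 1) is a commonly cited rule of thumb and is likewise an assumption, since the formula itself is not shown in the text.

```python
from itertools import combinations, product

attributes = ["fulfillment", "efficiency", "security", "shop type", "payment", "voucher"]
levels_per_attribute = 2

# Full factorial: every combination of the 2 alternatives (coded 0/1) of 6 attributes.
full_factorial = list(product([0, 1], repeat=len(attributes)))

# Assumed rule of thumb: total levels - number of attributes + 1.
min_cards = len(attributes) * levels_per_attribute - len(attributes) + 1

# One possible 8-card orthogonal plan (illustrative; SPSS may produce another).
cards = [(a, b, c, a ^ b, a ^ c, b ^ c) for a, b, c in product([0, 1], repeat=3)]


def is_orthogonal(design):
    """True if every pair of columns shows each of the 4 level pairs equally often."""
    for i, j in combinations(range(len(design[0])), 2):
        counts = {}
        for row in design:
            key = (row[i], row[j])
            counts[key] = counts.get(key, 0) + 1
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True
```

Here `len(full_factorial)` is 64 and `len(cards)` is 8, which is at least the assumed minimum of 7, and `is_orthogonal(cards)` confirms that the alternatives of any two attributes are balanced against each other across the reduced set.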

Customer Sociodemographic Characteristics
For the 83 respondents obtained, a descriptive analysis was carried out first to show the characteristics of Shopee e-commerce customers. Table 3 shows the sociodemographic characteristics of the respondents, who are customers of Shopee e-commerce in West Java province. The data were then processed and analyzed using SPSS 25.0 software.

Conjoint Analysis Result
Conjoint analysis was carried out to obtain the utility values of each respondent, which indicate the respondent's preferences among the alternatives for each attribute of Shopee's e-commerce e-service quality. Table 4 shows the results of the conjoint analysis carried out on all respondents. Kendall's tau is used to assess prediction accuracy: a high and significant correlation between the estimated results and the actual results indicates accurate predictions. If the p-value of Kendall's tau is smaller than 0.05 (5%), the results of the conjoint analysis have good prediction accuracy [17]. The results show a Kendall's tau value of 0.965 with a p-value of 0.001, which indicates that the conjoint analysis of the research data has good prediction accuracy.
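To illustrate how Kendall's tau compares estimated and actual preferences, the sketch below computes the simple tau-a coefficient from concordant and discordant pairs. The two rankings of 8 cards are hypothetical, and the function name `kendall_tau_a` is an assumption; SPSS 25.0 may apply a tie-corrected variant, so this is not a reproduction of the reported value of 0.965.

```python
from itertools import combinations


def kendall_tau_a(actual, predicted):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (a1, p1), (a2, p2) in combinations(list(zip(actual, predicted)), 2):
        direction = (a1 - a2) * (p1 - p2)
        if direction > 0:
            concordant += 1  # both rankings order the pair the same way
        elif direction < 0:
            discordant += 1  # the rankings disagree on the pair
    n = len(actual)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical rankings of 8 stimulus cards (not the study data).
actual_rank = [1, 2, 3, 4, 5, 6, 7, 8]
predicted_rank = [1, 2, 3, 4, 5, 6, 8, 7]
tau = kendall_tau_a(actual_rank, predicted_rank)
```

With 8 cards there are 28 pairs; only the last two cards are swapped, so 27 pairs are concordant and 1 is discordant, giving a tau of 26/28. A value close to 1 means the model reproduces the respondents' ordering of the cards almost exactly.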

Cluster Analysis Result
The utility values obtained through conjoint analysis, together with the sociodemographic characteristics of each respondent, were analyzed using cluster analysis to segment customers based on their sociodemographic characteristics. Table 4 shows the sociodemographic characteristics of the respondents in each cluster formed. By gender, the majority of cluster 1 members are women, and the majority of cluster 2 members are men. By education, the majority of members of both cluster 1 and cluster 2 are junior or senior high school graduates, but most D3 / S1 graduates are in cluster 1. By occupation, the majority of members of both clusters are students. By visit frequency, the majority of cluster 1 members visit Shopee e-commerce more than 12 times a month, and the majority of cluster 2 members visit Shopee e-commerce fewer than 5 times a month. By income, the majority of members of clusters 1 and 2 have an income of less than Rp. 1,000,000 a month, but most respondents with incomes above Rp. 5,000,000 a month are in cluster 1. By time of last purchase, the majority of members of clusters 1 and 2 made a purchase less than 1 week before the questionnaire was given.

Chi-Square Analysis Result
Chi-square analysis was conducted to examine the relationship between the sociodemographic characteristics of each respondent and the clusters formed. It aims to name the clusters according to the sociodemographic characteristics most closely related to cluster membership, in order to form customer segments. Table 6 shows the results of the chi-square analysis. According to these results, the sociodemographic characteristics that have a close relationship are age and time of visit. Therefore, cluster 1 was named the majority of women with frequent visits, while cluster 2 was named the majority of men with frequent visits.
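The chi-square test of association between a sociodemographic characteristic and cluster membership can be sketched as follows. The contingency table is hypothetical and is not taken from Table 6; the function name `chi_square` is an assumption. The statistic compares each observed count with the count expected if the characteristic and the cluster were independent.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows = categories of one characteristic, columns = cluster 1 and 2.
counts = [[30, 10], [10, 30]]
statistic, dof = chi_square(counts)
```

For this hypothetical table every expected count is 20, so the statistic is 20 with 1 degree of freedom, well above the 5% critical value of 3.841; a characteristic with such a result would be considered closely related to cluster membership and therefore useful for naming the clusters.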

Customer Preference
After the clusters have been identified, the preferences of each segment can be described. The description of customer preferences for each cluster contains the alternative utility values for each attribute and the importance value of each attribute of Shopee's e-commerce e-service quality. In Table 8, the importance value shows the priority that customers give to each attribute of Shopee's e-commerce e-service quality. The highest importance value in cluster 1 belongs to the payment method, while in cluster 2 it belongs to the type of shop.
The utility value shows which of the alternatives provided by Shopee's e-commerce customers tend to choose for each e-service quality attribute. The utility values for the fulfillment attribute show that customers in cluster 1 prefer delivery via COD, while customers in cluster 2 prefer delivery via delivery services. The utility values for the efficiency attribute show that customers in both cluster 1 and cluster 2 prefer well-organized information. The utility values for the security attribute show that customers in cluster 1 prefer that the customer's personal information is secured, while customers in cluster 2 prefer that messages between the seller and the customer are secured. The utility values for the shop type attribute show that customers in both clusters prefer star sellers to regular stores. The utility values for the payment attribute show that customers in both clusters prefer full payment. The utility values for the voucher attribute show that customers in cluster 1 prefer cashback vouchers, while customers in cluster 2 prefer free delivery vouchers.
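The link between utility values and importance values can be illustrated with a short Python sketch. It uses the usual conjoint convention in which the importance of an attribute is the range of its utility values divided by the sum of the ranges over all attributes; this convention is assumed here because the computation is not spelled out in the text. The utility numbers, the placeholder level names, and the function name `importance_values` are hypothetical and are not the values in Table 8.

```python
def importance_values(part_worths):
    """Relative importance (%) = utility range of an attribute / sum of all ranges."""
    ranges = {
        attribute: max(levels.values()) - min(levels.values())
        for attribute, levels in part_worths.items()
    }
    total = sum(ranges.values())
    return {attribute: 100 * r / total for attribute, r in ranges.items()}


# Hypothetical utility values for one cluster (illustrative only).
cluster_utilities = {
    "payment": {"level A": 0.6, "level B": -0.6},
    "shop type": {"level A": 0.3, "level B": -0.3},
    "security": {"level A": 0.1, "level B": -0.1},
}
importance = importance_values(cluster_utilities)
top_attribute = max(importance, key=importance.get)
```

In this example the payment attribute has the widest utility range and therefore the highest importance (about 60%), while security has the narrowest range and the lowest importance (about 10%). The same reading applies to the clusters in this study: the attribute with the highest importance value is the one the cluster prioritizes, and the one with the lowest is the least emphasized.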

Discussion
The results show that in West Java province, Shopee e-commerce customers are divided into 2 clusters, namely cluster 1, which contains a majority of women with frequent e-commerce visits, and cluster 2, which contains a majority of men with frequent e-commerce visits. The results also show that customers in cluster 1 tend to prioritize payment methods when shopping on Shopee, while customers in cluster 2 tend to prioritize store types. These two results contradict Wu's research, which explains that fulfillment is the most important aspect of e-commerce [18]. The priority that cluster 1 customers give to payment methods is consistent with previous research by Sari et al. [19], in which people tend to prefer electronic payments to going out to make payments. Electronic payment makes the online shopping process easier for customers because they do not need to go out to complete payment transactions. The priority that cluster 2 customers give to store types is consistent with previous research by Shannaz et al. [20], in which people look at a store's reputation in order to trust buying products from that store.
On the other hand, the least emphasized attribute in cluster 1 is fulfillment. This result contradicts Wu's research, which explains that fulfillment is the most important aspect of e-commerce [18]. Cluster 1 members explained that the ease of package tracking available today has built their trust in any shipping method. The least important attribute in cluster 2 is security, which still supports the research of Shannaz et al. [20], in which a good store reputation supports the security of customer data, so that the type of store is the more important attribute.

CONCLUSION
Based on the results of the discussion, it can be concluded that there are 2 groupings of customers based on their sociodemographic characteristics: cluster 1, whose members are mostly women with frequent visits, and cluster 2, whose members are mostly men with frequent visits. Members of cluster 1 prioritize the full payment method when shopping online, while members of cluster 2 prioritize the star seller shop type. The least important aspect when shopping online is fulfillment in cluster 1 and security in cluster 2.