Estimation of Palm Oil Production Results Using a Regression Model (PT. Menthobi Makmur Lestari)

PT Menthobi Makmur Lestari is a growing industrial company in the field of palm oil production. The company is targeting increased palm oil production and needs forecasts to plan capacity and manufacturing facilities. One of the prediction methods used is multiple linear regression. The independent variables used for prediction are tree age, land area (ha), number of trees, and number of bunches, and the dependent variable is the palm oil production yield. The correlation test using multiple regression showed significance values of 0.05 or less. Hypothesis testing of the multiple linear regression and correlation used the t test and F test at a significance level of α = 0.05. The multiple correlation coefficient (R) is 0.947 and the coefficient of determination is 90%. The predictive accuracy of the multiple regression equation, obtained from validation on training and testing data, is a mean absolute percentage error (MAPE) of 21%.


INTRODUCTION
The demand for palm oil as a vegetable oil continues to increase worldwide [1]. This is because palm oil is used not only for human consumption, but also as fuel and as a raw material in the chemical industry [2]. This increase in global consumption has resulted in continuous land expansion [3]. Several factors affect the productivity of oil palm crops, namely climate, topography, soil conditions, planting material, and cultivation techniques [4]. Plant age, plant population per hectare, the pollination system, the coordination of harvesting and processing, and other factors also affect oil palm productivity [5]. To support production efforts, techniques are needed for supervision and policy making based on predictions of future palm oil production [6]. Data mining is the search for trends or patterns of interest in a large database to support future decision making [7]. One data mining algorithm is multiple linear regression, which is used here to predict palm oil production yields and to find patterns of relationships between the factors that affect production yields for the following year [8]. Based on the preceding discussion, this study estimates the yield of palm oil production and looks for patterns of relationships between the factors that affect palm oil production using the multiple linear regression method [9]. This study therefore takes as its theme "the application of the multiple linear regression method in predicting palm oil production results" [10].
Palm oil is an important source of vegetable oil [11]. Palm oil has been used since the 15th century, and it has been marketed to Europe since the 1800s [12]. The oil comes from the flesh of the fruit (mesocarp) and from the palm kernel (endosperm) [13]. The growth of the world's population has caused demand for crude palm oil (CPO) to increase rapidly as well [14]. To meet this demand, several countries, especially Indonesia, are increasing palm oil production by expanding oil palm plantations [15].

MATERIALS AND METHODS
The research method consists of the stages of the research flowchart that was carried out.

Predictions
Prediction is the process of systematically estimating what is most likely to happen in the future based on available past and present information, so that the difference between what actually happens and the forecast result can be reduced [16]. According to the Great Dictionary of the Indonesian Language (Kamus Besar Bahasa Indonesia), prediction is the result of forecasting or estimating future values using past data [17]. Predictions show what will happen in a given situation and are inputs to the planning and decision-making process.

Multiple Linear Regression
In multiple regression, there is one dependent variable and more than one independent variable. An analysis with more than one independent variable is called multiple linear regression analysis. The multiple linear regression technique is used to determine whether two or more independent variables (X1, X2, X3, ..., Xk) have a significant influence on the dependent variable (Y). Mathematically, the model takes the general form Y = a + b1X1 + b2X2 + ... + bkXk, where a is the intercept and b1, ..., bk are the regression coefficients.
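As an illustration of how such a model can be fitted, the following is a minimal sketch that estimates the intercept and coefficients by ordinary least squares through the normal equations. It is not the procedure used in this study: the function names, the six sample rows, and the "true" coefficients are all hypothetical and were invented for the example, with four columns standing in for X1 to X4.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so each row operation applies to both sides.
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_linear_regression(rows, y):
    """Return [a, b1, ..., bk] solving the normal equations (X'X) beta = X'y."""
    # Prepend a column of ones so the first coefficient is the intercept a.
    design = [[1.0] + list(map(float, r)) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(beta, row):
    """Evaluate Y = a + b1*X1 + ... + bk*Xk for one observation."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))


# Hypothetical, made-up observations (NOT the Kujan Estate records).
rows = [
    [1, 2, 3, 4],
    [2, 1, 0, 3],
    [3, 5, 2, 1],
    [4, 3, 6, 2],
    [5, 8, 1, 7],
    [6, 2, 4, 5],
]
# Hypothetical coefficients [a, b1, b2, b3, b4] used to generate noise-free Y.
true_beta = [1.0, 2.0, 0.5, -1.0, 3.0]
y = [predict(true_beta, r) for r in rows]

beta = fit_multiple_linear_regression(rows, y)
print([round(b, 6) for b in beta])
```

Because the sample Y values are generated without noise, the fitted coefficients should match the hypothetical ones up to rounding. With real data the fit would only approximate the underlying relationship, and a statistical package would normally be used to obtain the t and F statistics as well.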

RESULTS AND DISCUSSION
Data were collected through interviews and secondary data collection. The data used in this study are palm oil production data from the Kujan Estate for the last 10 years, from 2009 to 2019, comprising 450 records. The independent attributes are age, number of trees, and number of bunches, while the dependent attribute is palm oil production. Predictions are then made to find patterns of relationships between the variables for palm oil production in 2019.
Next, the data were cleaned. Data cleaning reduces noise that can affect the calculations. The initial data amounted to 460 records; after cleaning, 450 records remained. An excerpt of the 450 data records is provided as a data attachment. In this study, the dependent variable is the yield of palm oil production (Y), while the independent variables are age (X1), area in ha (X2), number of trees (X3), and number of bunches (X4). The data are described by the lowest score, highest score, mean, standard deviation (SD), and number of observations (N).
The significance value of the correlation between the independent variables and production yield is 0.00, which means that each independent variable has an influence on the dependent variable, namely the yield of palm oil production. From the results of the regression analysis, it can be seen that the independent variables have a significant effect on the dependent variable both simultaneously and partially.
Future research could predict palm oil production using other methods. The dataset used in this study was of limited quality, so the regression predictions have a high error rate (MAPE) of 21%, as can be seen by comparing the predictions against the actual data.
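To make the reported accuracy figures concrete, the sketch below shows how a mean absolute percentage error and a coefficient of determination are conventionally computed, and how the reported R = 0.947 relates to the reported 90%. The function names and the four actual/predicted pairs are hypothetical and chosen only for easy arithmetic; they are not values from this study.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Return MAPE in percent; assumes no actual value is zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def coefficient_of_determination(actual, predicted):
    """Return R^2 = 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical, made-up values (NOT results from the study).
actual = [100.0, 200.0, 400.0, 500.0]
predicted = [110.0, 180.0, 400.0, 450.0]

mape = mean_absolute_percentage_error(actual, predicted)
r_squared = coefficient_of_determination(actual, predicted)

# The reported multiple correlation R = 0.947 squares to about 0.897,
# which rounds to the reported coefficient of determination of 90%.
r_squared_from_reported_r = 0.947 ** 2

print(round(mape, 2), round(r_squared, 2), round(r_squared_from_reported_r, 3))
```

The two measures answer different questions, which is why a model can explain about 90% of the variance in yield and still show a 21% average percentage error: MAPE weights each record by its own actual value, so errors on low-yield records count heavily.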