Revolutionizing Marketing Analytics: Key Advances in AI Research

AI Research Advances For Improving Marketing Analytics & Insights

This research summary is part of our AI for Marketing series, which covers the latest AI and machine learning approaches to five aspects of marketing automation.

We cover the latest research papers on applying AI to common marketing analytics tasks like customer clustering and sentiment analysis.

How well do you know your customers? Do you know what they like about your product and what they don’t? Do they interact with your product functionally or emotionally? How does the perception of your brand change over time? To succeed, you need to know your customers very well. Understanding how different groups of customers interact with your product is essential for building effective marketing campaigns.

Important Marketing Analysis Research Papers

1. Targeted Aspect-Based Sentiment Analysis via Embedding Commonsense Knowledge into an Attentive LSTM by Yukun Ma, Haiyun Peng, Erik Cambria

Original Abstract

Analyzing people’s opinions and sentiments towards certain aspects is an important task of natural language understanding. In this paper, we propose a novel solution to targeted aspect-based sentiment analysis, which tackles the challenges of both aspect-based sentiment analysis and targeted sentiment analysis by exploiting commonsense knowledge. We augment the long short-term memory (LSTM) network with a hierarchical attention mechanism consisting of a target-level attention and a sentence-level attention. Commonsense knowledge of sentiment-related concepts is incorporated into the end-to-end training of a deep neural network for sentiment classification. In order to tightly integrate the commonsense knowledge into the recurrent encoder, we propose an extension of LSTM, termed Sentic LSTM. We conduct experiments on two publicly released datasets, which show that the combination of the proposed attention architecture and Sentic LSTM can outperform state-of-the-art methods in targeted aspect sentiment tasks.

Our Summary

The researchers claim that incorporating commonsense knowledge into a deep neural network can significantly improve targeted aspect-based sentiment analysis by directly contributing to the identification of aspects and sentiment polarity. Additionally, they suggest augmenting the LSTM network with target-level and sentence-level attention. The experiments confirm the effectiveness of the suggested approach for targeted aspect-based sentiment analysis.

What’s the core idea of this paper?

  • The proposed neural architecture for targeted aspect-based sentiment analysis includes three key components:
    • target-level attention to learn the sentiment-salient part of a target expression and generate a more accurate representation of the target (e.g., product);
    • sentence-level attention to search for target- and aspect-dependent evidence over the full sentence;
    • Sentic LSTM, an extension of the LSTM cell to incorporate affective commonsense knowledge.
  • Sentic LSTM has two important roles in this architecture:
    • assisting with the filtering of information flowing from one time step to the next;
    • providing complementary information to the memory cell.
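
To make the attention components more concrete, here is a minimal sketch of a generic dot-product attention over the hidden states of a sentence. It is not the scoring function from the paper, and it leaves out the Sentic LSTM cell entirely; the names `attend`, `hidden_states`, and `query` are our own illustrative choices, with `query` standing in for a target or aspect representation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def attend(hidden_states, query):
    """Generic dot-product attention (an assumption, not the paper's scorer).

    hidden_states: one vector per word, e.g. LSTM outputs.
    query: a vector standing in for the target/aspect representation.
    Returns the attention weights and the weighted sum of hidden states.
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context

# Two toy "word" states; the query is aligned with the first one.
weights, context = attend([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0])
```

The word whose hidden state best matches the query receives the largest weight, which is the sense in which attention "searches" the sentence for target-dependent evidence.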

What’s the key achievement?

  • Outperforming strong baselines in such tasks as aspect categorization and aspect-based sentiment classification.
  • Demonstrating the efficacy of incorporating commonsense knowledge into the LSTM network for targeted aspect-based sentiment analysis.

What does the AI community think?

  • The paper was presented at AAAI 2018, one of the key conferences on artificial intelligence.

What are future research areas?

  • Analyzing collectively the sentiment of multiple targets co-occurring in the same sentence.
  • Investigating the role of commonsense knowledge in modeling the relation between targets.

What are possible business applications?

  • The suggested approach can improve the accuracy of sentiment analysis to provide marketers with more reliable information about the customers’ feedback on different products and different aspects of the same product.

2. Aspect Based Sentiment Analysis with Gated Convolutional Network by Wei Xue and Tao Li

Original Abstract

Aspect based sentiment analysis (ABSA) can provide more detailed information than general sentiment analysis, because it aims to predict the sentiment polarities of the given aspects or entities in text. We summarize previous approaches into two subtasks: aspect-category sentiment analysis (ACSA) and aspect-term sentiment analysis (ATSA). Most previous approaches employ long short-term memory and attention mechanisms to predict the sentiment polarity of the concerned targets, which are often complicated and need more training time. We propose a model based on convolutional neural networks and gating mechanisms, which is more accurate and efficient. First, the novel Gated Tanh-ReLU Units can selectively output the sentiment features according to the given aspect or entity. The architecture is much simpler than attention layer used in the existing models. Second, the computations of our model could be easily parallelized during training, because convolutional layers do not have time dependency as in LSTM layers, and gating units also work independently. The experiments on SemEval datasets demonstrate the efficiency and effectiveness of our models.

Our Summary

The authors introduce a novel, accurate, and efficient approach to aspect-based sentiment analysis. They claim that an architecture based on convolutional neural networks (CNNs) and gating mechanisms is simpler and more efficient than traditional approaches built around long short-term memory (LSTM) networks with attention mechanisms. Convolutional layers have no time dependency, so their computations can be parallelized, which substantially decreases the training time. The experimental results demonstrate the effectiveness and efficiency of the proposed approach to aspect-based sentiment analysis.

What’s the core idea of this paper?

  • The research paper introduces solutions to both:
    • Aspect-Category Sentiment Analysis (ACSA) where the model is asked to predict the sentiment polarity towards a predefined aspect category (e.g., food, service, price).
    • Aspect-Term Sentiment Analysis (ATSA) where sentiment analysis is performed toward the aspect terms that are identified in the specific sentence (e.g., Thai food in the sentence “Average to good Thai food, but terrible delivery”).
  • The proposed approach is called Gated Convolutional Network with Aspect Embedding (GCAE), and is probably the first CNN-based solution to aspect-based sentiment analysis:
    • for the ACSA task, the model includes two separate convolutional layers on top of the embedding layer, whose outputs are combined by gating units; these gating units have two nonlinear gates, each of which is connected to one convolutional layer;
    • for the ATSA task, where the aspect terms may contain several words, the model is extended with an additional convolutional layer for the target expressions.
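
The gating idea can be sketched in a few lines of plain Python. This is a simplified illustration of a single Gated Tanh-ReLU feature, assuming that the aspect embedding enters the ReLU gate as an additive term and that features are max-pooled over time; the function names, the single-filter setup, and the toy weights are hypothetical, and this is not the authors' implementation.

```python
import math

def conv1d(X, filt, bias=0.0):
    """Slide one filter over the word vectors X (no padding).

    X: list of word embeddings; filt: one weight vector per filter position.
    """
    width = len(filt)
    out = []
    for t in range(len(X) - width + 1):
        s = bias
        for k in range(width):
            s += sum(w * x for w, x in zip(filt[k], X[t + k]))
        out.append(s)
    return out

def gtru_feature(X, filt_s, filt_a, aspect, v_aspect):
    """One gated feature: max over time of tanh(conv_s) * relu(conv_a + v.aspect).

    Assumes the sentence has at least as many words as the filter width.
    """
    sentiment = [math.tanh(z) for z in conv1d(X, filt_s)]
    aspect_term = sum(v * a for v, a in zip(v_aspect, aspect))
    gate = [max(0.0, z + aspect_term) for z in conv1d(X, filt_a)]
    gated = [s * g for s, g in zip(sentiment, gate)]
    return max(gated)  # max-over-time pooling

# Toy sentence of three 2-d word vectors and width-2 filters.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
filt = [[1.0, 0.0], [0.0, 1.0]]
open_gate = gtru_feature(X, filt, filt, aspect=[1.0], v_aspect=[0.0])
closed_gate = gtru_feature(X, filt, filt, aspect=[1.0], v_aspect=[-10.0])
```

When the aspect term pushes the ReLU gate to zero, the sentiment feature is blocked; otherwise it passes through scaled by the gate. Because each window is computed independently, all positions can be evaluated in parallel, unlike the time steps of an LSTM.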
[Figure: Gated Convolutional Network with Aspect Embedding for the ATSA task]

What’s the key achievement?

  • GCAE outperforms several strong baselines, demonstrating higher accuracy in aspect-based sentiment analysis.
  • The presented approach performs especially well on the hard test dataset, where a given sentence includes different sentiments towards different aspects.
  • In terms of the training time, the experiments confirm that GCAE is much faster than other neural models.

What does the AI community think?

  • The paper was presented at ACL 2018, one of the key research conferences on natural language processing.

What are future research areas?

  • Leveraging large-scale sentiment lexicons in neural networks.

What are possible business applications?

  • The Gated Convolutional Network presented in this research paper can be a good candidate for performing aspect-based sentiment analysis in a business setting because of its:
    • high accuracy;
    • ability to recognize different sentiments towards different aspects provided within one sentence;
    • very fast training.

Where can you get implementation code?

3. Multimodal Image Captioning for Marketing Analysis by Philipp Harzig, Stephan Brehm, Rainer Lienhart, Carolin Kaiser, René Schallner

Original Abstract

Automatically captioning images with natural language sentences is an important research topic. State of the art models are able to produce human-like sentences. These models typically describe the depicted scene as a whole and do not target specific objects of interest or emotional relationships between these objects in the image. However, marketing companies require to describe these important attributes of a given scene. In our case, objects of interest are consumer goods, which are usually identifiable by a product logo and are associated with certain brands. From a marketing point of view, it is desirable to also evaluate the emotional context of a trademarked product, i.e., whether it appears in a positive or a negative connotation. We address the problem of finding brands in images and deriving corresponding captions by introducing a modified image captioning network. We also add a third output modality, which simultaneously produces real-valued image ratings. Our network is trained using a classification-aware loss function in order to stimulate the generation of sentences with an emphasis on words identifying the brand of a product. We evaluate our model on a dataset of images depicting interactions between humans and branded products. The introduced network improves mean class accuracy by 24.5 percent. Thanks to adding the third output modality, it also considerably improves the quality of generated captions for images depicting branded products.

Our Summary

This research paper introduces an approach to image captioning with a specific focus on marketing needs. From the marketing perspective, it is desirable that an image caption targets the consumer product depicted in the image and also evaluates the emotional context of this product. The introduced neural network is trained to generate sentences with words that identify the brand of a product. Furthermore, the model produces three kinds of image ratings that reflect customers’ interaction with a product. The experiments demonstrate that this approach provides image captions that are more accurate and more useful for marketing purposes.

What’s the core idea of this paper?

  • Using a popular Show and Tell model as a basis.
  • Implementing a loss function that directly penalizes the model when the brand name doesn’t appear in a generated caption.
  • Extending the model with a third output modality to produce three image rating attributes:
    • whether the person interacts with the branded product in a positive (0) or negative (4) way;
    • if the person in the image is involved (0) with the branded product or uninvolved (4);
    • if there is an emotional (0) or a functional (4) interaction with the branded product.

What’s the key achievement?

  • Providing a metric to measure if an image is correctly classified with respect to objects of interest in the generated caption.
  • Showing that combining multiple tasks in one model improves performance on all of them: including three output modalities in the suggested model resulted in better caption quality, brand name detection, and image ratings.
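
As a rough illustration of the caption-based classification metric, the sketch below treats a caption as "classifying" an image by the brand name it mentions and then averages per-brand accuracy, so rare brands count as much as frequent ones. The token-matching rule, the function names, and the brand names `acme` and `globex` are our own assumptions for illustration; the paper's exact matching procedure may differ.

```python
from collections import defaultdict

def brand_in_caption(caption, brands):
    """Return the first known brand mentioned in the caption, or None.

    Assumes single-word brand names matched case-insensitively on
    whitespace-separated tokens (a simplification).
    """
    tokens = caption.lower().split()
    for brand in brands:
        if brand.lower() in tokens:
            return brand
    return None

def mean_class_accuracy(captions, true_brands, brands):
    """Average of per-brand accuracies over the brands present in the labels."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for caption, truth in zip(captions, true_brands):
        total[truth] += 1
        if brand_in_caption(caption, brands) == truth:
            correct[truth] += 1
    return sum(correct[b] / total[b] for b in total) / len(total)

# Hypothetical generated captions and ground-truth brands.
captions = ["a man drinks acme cola",
            "a woman holds a bottle",
            "a child with globex shoes"]
truth = ["acme", "acme", "globex"]
score = mean_class_accuracy(captions, truth, ["acme", "globex"])
```

A caption that describes the scene correctly but omits the brand, like the second one here, counts as a miss, which is exactly the failure mode the classification-aware loss is meant to discourage.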

What does the AI community think?

  • The paper was presented at the 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR2018).

What are possible business applications?

  • The suggested approach enables large-scale capture of valuable information from social media pictures containing branded products, including:
    • how people react to and interact with a product;
    • how a brand’s popularity and perception change over time;
    • whether customers develop emotional connections with a brand, etc.

4. SpectralNet: Spectral Clustering using Deep Neural Networks by Uri Shaham, Kelly Stanton, Henry Li, Boaz Nadler, Ronen Basri, Yuval Kluger

Original Abstract

Spectral clustering is a leading and popular technique in unsupervised data analysis. Two of its major limitations are scalability and generalization of the spectral embedding (i.e., out-of-sample-extension). In this paper we introduce a deep learning approach to spectral clustering that overcomes the above shortcomings. Our network, which we call SpectralNet, learns a map that embeds input data points into the eigenspace of their associated graph Laplacian matrix and subsequently clusters them. We train SpectralNet using a procedure that involves constrained stochastic optimization. Stochastic optimization allows it to scale to large datasets, while the constraints, which are implemented using a special-purpose output layer, allow us to keep the network output orthogonal. Moreover, the map learned by SpectralNet naturally generalizes the spectral embedding to unseen data points. To further improve the quality of the clustering, we replace the standard pairwise Gaussian affinities with affinities learned from unlabeled data using a Siamese network. Additional improvement can be achieved by applying the network to code representations produced, e.g., by standard autoencoders. Our end-to-end learning procedure is fully unsupervised. In addition, we apply VC dimension theory to derive a lower bound on the size of SpectralNet. State-of-the-art clustering results are reported on the Reuters dataset. Our implementation is publicly available at https://github.com/kstant0725/SpectralNet.

Our Summary

In this research paper, the authors address two major limitations of spectral clustering: scalability and generalization. They introduce a deep neural network, called SpectralNet, that overcomes both of these issues. Scalability is achieved through stochastic optimization, while out-of-sample extension comes from the network itself, which directly computes the embedding of any input data point in the eigenspace. The experiments show the effectiveness of SpectralNet with respect to capturing non-convex clusters.

What’s the core idea of this paper?

  • The paper introduces SpectralNet, a deep learning approach to spectral clustering that solves scalability and generalization issues.
  • SpectralNet is trained using constrained stochastic optimization:
    • stochastic optimization enables scaling to large datasets;
    • constraints allow keeping the network output orthogonal.
  • Once trained, the model provides a function, implemented as a feed-forward network, that maps each input data point to its spectral embedding coordinates, enabling the out-of-sample extension.
  • Instead of standard Gaussian affinities computed from Euclidean distances, the model can use affinities learned from unlabeled data with a Siamese network.
  • For further improvement, the network can be applied to code representations of the data produced by an autoencoder.
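
The two ingredients of the training procedure, an affinity-weighted loss and an orthogonality constraint on the outputs, can be sketched on a single mini-batch. This is a toy illustration in plain Python, not the released library: the loss is written as the affinity-weighted squared distance between embeddings, and the orthogonality step uses Gram-Schmidt as a stand-in for the paper's special-purpose output layer. All function names are ours.

```python
import math

def gaussian_affinity(x, y, sigma=1.0):
    """Standard Gaussian affinity between two points."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2 * sigma ** 2))

def spectral_loss(W, Y):
    """(1/m^2) * sum_ij W[i][j] * ||y_i - y_j||^2 for a batch of m embeddings."""
    m = len(Y)
    total = 0.0
    for i in range(m):
        for j in range(m):
            d2 = sum((a - b) ** 2 for a, b in zip(Y[i], Y[j]))
            total += W[i][j] * d2
    return total / (m * m)

def orthonormalize(Y):
    """Make the batch outputs satisfy (1/m) * Y^T Y = I.

    Gram-Schmidt on the columns of Y (a stand-in for the paper's output
    layer); assumes the columns are linearly independent.
    """
    m, k = len(Y), len(Y[0])
    basis = []
    for c in range(k):
        v = [Y[i][c] for i in range(m)]
        for b in basis:
            proj = sum(x * y for x, y in zip(v, b))
            v = [x - proj * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        basis.append([x / norm for x in v])
    scale = math.sqrt(m)
    return [[scale * basis[c][i] for c in range(k)] for i in range(m)]

# Toy batch: four 2-d points and their (pretend) 2-d network outputs.
points = [[0.0, 0.0], [0.1, 0.0], [3.0, 3.0], [3.1, 3.0]]
raw_outputs = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [2.0, 1.0]]
W = [[gaussian_affinity(a, b) for b in points] for a in points]
loss = spectral_loss(W, orthonormalize(raw_outputs))
```

Without the constraint, mapping every point to the same vector would drive the loss to zero, so the orthogonality step is what rules out that trivial solution. Because the loss is a sum over pairs within a batch, it can be minimized with stochastic gradient steps on large datasets.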

What’s the key achievement?

  • Outperforming existing clustering methods when clusters cannot be contained in non-overlapping convex shapes:
    • state-of-the-art results on the Reuters document dataset;
    • competitive results on the MNIST dataset of handwritten images.

What does the AI community think?

  • The paper was presented at ICLR 2018, one of the key deep learning conferences.

What are future research areas?

  • Gaining a better understanding of why affinities learned by Siamese networks outperform the common Euclidean distance approach.
  • Examining how stochastic gradient descent can be adapted to improve the convergence rate of SpectralNet.

What are possible business applications?

  • SpectralNet is good at capturing non-convex clusters and thus might benefit marketing analysis with regard to clustering customers, products, or images.

Where can you get implementation code?

  • The authors provide access to SpectralNet, a Python library for performing spectral clustering using deep neural networks.

5. Ask less – Scale Market Research without Annoying Your Customers by Venkatesh Umaashankar and Girish Shanmugam S

Original Abstract

Market research is generally performed by surveying a representative sample of customers with questions that includes contexts such as psycho-graphics, demographics, attitude and product preferences. Survey responses are used to segment the customers into various groups that are useful for targeted marketing and communication. Reducing the number of questions asked to the customer has utility for businesses to scale the market research to a large number of customers. In this work, we model this task using Bayesian networks. We demonstrate the effectiveness of our approach using an example market segmentation of broadband customers.

Our Summary

The researchers study the problem of conducting market research for customer segmentation and propose using Bayesian networks to reduce the number of questions in the survey and thus scale the research to more customers. They suggest exploiting a key advantage of Bayesian networks: their ability to handle partial information at inference time. The experiments in a real-world setting demonstrate that the proposed approach can help to reduce the number of questions by 50% with only a minor drop in classification performance.

What’s the core idea of this paper?

  • The paper introduces a novel Bayesian-based approach to scaling market research for customer segmentation.
  • The proposed approach makes it possible to significantly reduce the number of questions in a market research survey.
  • This Bayesian-based method is implemented in two phases:
    • preparatory phase, where a company rolls out a survey questionnaire to a representative sample of customers and then learns a Bayesian network for segmentation to find the minimum number of required questions;
    • scaling phase, where a company asks customers a defined number of random questions instead of going through the whole questionnaire and then assigns a segment based on the results from the Bayesian network model.
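
The two phases can be illustrated with the simplest possible Bayesian network, a naive Bayes structure in which every answer depends only on the segment. This structure is our simplifying assumption, not the network learned in the paper, and the question names, answers, and segments below are made up. The point of the sketch is that unanswered questions are simply left out of the product, which is how the model copes with partial information at inference time.

```python
from collections import Counter, defaultdict

def fit(responses, segments):
    """Preparatory phase: count answers per segment from full surveys."""
    prior = Counter(segments)
    cond = defaultdict(Counter)  # (segment, question) -> answer counts
    values = defaultdict(set)    # question -> set of answers seen
    for answers, seg in zip(responses, segments):
        for question, answer in answers.items():
            cond[(seg, question)][answer] += 1
            values[question].add(answer)
    return prior, cond, values

def posterior(partial_answers, prior, cond, values):
    """Scaling phase: P(segment | answered questions only).

    Unanswered questions are marginalized out, which in a naive Bayes
    structure just means skipping them. Uses add-one smoothing.
    """
    total = sum(prior.values())
    scores = {}
    for seg, count in prior.items():
        p = count / total
        for question, answer in partial_answers.items():
            if question not in values:
                continue  # question never seen in the preparatory survey
            counts = cond[(seg, question)]
            p *= (counts[answer] + 1) / (sum(counts.values()) + len(values[question]))
        scores[seg] = p
    z = sum(scores.values())
    return {seg: p / z for seg, p in scores.items()}

# Hypothetical full survey of broadband customers.
responses = [{"speed": "high", "price": "low"},
             {"speed": "high", "price": "high"},
             {"speed": "low", "price": "low"},
             {"speed": "low", "price": "low"}]
segments = ["gamer", "gamer", "saver", "saver"]
model = fit(responses, segments)

# A new customer answers only one of the two questions.
probs = posterior({"speed": "high"}, *model)
best_segment = max(probs, key=probs.get)
```

A richer network with dependencies between questions would need proper marginalization over the unanswered ones, but the workflow stays the same: learn from the full questionnaire once, then classify from a subset of answers.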

What’s the key achievement?

  • The proposed Bayesian-based approach to scaling market research makes it possible to significantly reduce the number of questions in a survey, and thus:
    • saves time on performing market research;
    • helps to avoid annoying customers with long questionnaires.

What does the AI community think?

  • The paper was presented at the 8th International Conference on Computer Science and Information Technology (CCSIT 2018) and International Conference on Artificial Intelligence, Smart Grid and Smart City Applications (AISGSC 2019).

What are possible business applications?

  • The proposed approach to scaling market research can be directly implemented in the business setting to perform high-quality research for accurate customer segmentation and yet avoid customer irritation with long questionnaires.

6. A Deep Probabilistic Model for Customer Lifetime Value Prediction, by Xiaojing Wang, Tianqi Liu, Jingang Miao

Original Abstract

Accurate predictions of customers’ future lifetime value (LTV) given their attributes and past purchase behavior enables a more customer-centric marketing strategy. Marketers can segment customers into various buckets based on the predicted LTV and, in turn, customize marketing messages or advertising copies to serve customers in different segments better. Furthermore, LTV predictions can directly inform marketing budget allocations and improve real-time targeting and bidding of ad impressions.

One challenge of LTV modeling is that some customers never come back, and the distribution of LTV can be heavy-tailed. The commonly used mean squared error (MSE) loss does not accommodate the significant fraction of zero value LTV from one-time purchasers and can be sensitive to extremely large LTVs from top spenders. In this article, we model the distribution of LTV given associated features as a mixture of zero point mass and lognormal distribution, which we refer to as the zero-inflated lognormal (ZILN) distribution. This modeling approach allows us to capture the churn probability and account for the heavy-tailedness nature of LTV at the same time. It also yields straightforward uncertainty quantification of the point prediction. The ZILN loss can be used in both linear models and deep neural networks (DNN). For model evaluation, we recommend the normalized Gini coefficient to quantify model discrimination and decile charts to assess model calibration. Empirically, we demonstrate the predictive performance of our proposed model on two real-world public datasets.

Our Summary

In this paper, the Google research team addresses the problem of predicting customers’ future lifetime value (LTV). In particular, they want to solve the problem of the heavy-tailed distribution of LTV, which results from the high number of one-time purchasers and the large LTVs of top spenders. To this end, they suggest modeling LTV using the zero-inflated lognormal (ZILN) distribution, which is a mixture of a point mass at zero and a lognormal distribution, and using supervised regression to leverage all customer-level attributes. They also measure a model’s ability to differentiate high-value customers from low-value ones with the normalized Gini coefficient. The experiments on two real-world datasets demonstrate the effectiveness of the suggested approach.

What’s the core idea of this paper?

  • Prediction of customer lifetime value is important for a firm’s financial planning, marketing decisions, and customer relationship management.
  • When predicting the LTV of new customers, the commonly used frequency and recency characteristics cannot differentiate among customers. Thus, the authors suggest leveraging customer attributes and purchase characteristics by applying a supervised regression using a deep neural network (DNN).
  • Further, the authors point out the challenges associated with the LTV distribution, which is usually heavy-tailed and volatile due to the high number of non-returning customers and extremely large LTVs for the top spenders:
    • Mean Squared Error (MSE) is not appropriate in this case as it (a) ignores the fact that LTV labels include both zero and continuous values; (b) is highly sensitive to outliers because of the squared term.
    • The solution is to model the zero-inflated lognormal (ZILN) distribution, which handles the zero and extremely large LTVs by design.
  • The model is evaluated using the normalized Gini coefficient, which is robust to outliers and allows for better business interpretation.
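
To show how a zero-inflated lognormal loss handles both zeros and large values, here is a minimal per-customer sketch written as a negative log-likelihood. We assume the model emits three raw numbers per customer (a logit for the probability of returning, the lognormal mean `mu`, and a raw scale passed through softplus); this parameterization and the function names are our own choices and are not taken from the authors' code.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def softplus(z):
    """Numerically stable log(1 + exp(z)); keeps sigma positive."""
    return math.log1p(math.exp(-abs(z))) + max(z, 0.0)

def ziln_loss(y, logit, mu, raw_sigma, eps=1e-12):
    """Negative log-likelihood of one LTV label under a ZILN distribution."""
    p = sigmoid(logit)  # probability that the customer returns (LTV > 0)
    if y <= 0:
        return -math.log(max(1.0 - p, eps))  # only the zero point mass applies
    sigma = softplus(raw_sigma) + eps
    nll_positive = -math.log(max(p, eps))
    nll_lognormal = (math.log(y * sigma * math.sqrt(2.0 * math.pi))
                     + (math.log(y) - mu) ** 2 / (2.0 * sigma ** 2))
    return nll_positive + nll_lognormal

def ziln_mean(logit, mu, raw_sigma):
    """Expected LTV: P(return) times the lognormal mean."""
    sigma = softplus(raw_sigma)
    return sigmoid(logit) * math.exp(mu + sigma ** 2 / 2.0)

# A one-time purchaser and a returning customer who later spends 120.
loss_churned = ziln_loss(0.0, logit=-1.0, mu=3.0, raw_sigma=0.5)
loss_returning = ziln_loss(120.0, logit=1.5, mu=math.log(120.0), raw_sigma=0.5)
expected_ltv = ziln_mean(logit=1.5, mu=math.log(120.0), raw_sigma=0.5)
```

The error for positive labels is measured on the log scale, so one very large spender cannot dominate the loss the way it would under MSE, and the zero labels are handled by a separate classification term rather than being forced into the regression.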

What’s the key achievement?

  • The experiments demonstrate that both deep neural network architecture and ZILN loss contribute to:
    • a higher Spearman’s correlation between true and predicted LTV;
    • a higher normalized Gini coefficient.
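
The normalized Gini coefficient can be computed with a short ranking-based routine. The sketch below uses one common formulation: sort customers by predicted LTV, accumulate their true LTV, and divide the resulting Gini by the Gini of a perfect ranking. Treat it as an illustration under that assumption; the helper names are ours.

```python
def gini(actual, predicted):
    """Gini of the cumulative true value when customers are sorted by prediction."""
    n = len(actual)
    # Highest predicted LTV first; ties broken by original position.
    order = sorted(range(n), key=lambda i: (-predicted[i], i))
    total = sum(actual)
    cumulative = 0.0
    gain_sum = 0.0
    for i in order:
        cumulative += actual[i]
        gain_sum += cumulative / total
    return (gain_sum - (n + 1) / 2.0) / n

def normalized_gini(actual, predicted):
    """Model Gini divided by the Gini of a perfect ranking (1.0 is best)."""
    return gini(actual, predicted) / gini(actual, actual)

# Toy labels with zeros and one top spender; predictions only need to rank well.
true_ltv = [0.0, 0.0, 5.0, 100.0]
pred_ltv = [0.1, 0.2, 3.0, 50.0]
score = normalized_gini(true_ltv, pred_ltv)
```

Because the metric depends only on the ordering induced by the predictions, a model that ranks customers correctly scores 1.0 even if its predicted amounts are off, which is why the authors pair it with decile charts to assess calibration.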

What are future research areas?

  • Exploring possible ways to further improve the predictive performance of the introduced approach by experimenting with model architecture and tuning model hyperparameters.

What are possible business applications?

  • The suggested approach to predicting customers’ lifetime value can help marketers improve their financial planning and customer relationship management.

Where can you get implementation code?

  • The implementation of the suggested approach to predicting customers’ lifetime value is available on GitHub.

7. Context-aware Embedding for Targeted Aspect-based Sentiment Analysis, by Bin Liang, Jiachen Du, Ruifeng Xu, Binyang Li, Hejiao Huang

Original Abstract

Attention-based neural models were employed to detect the different aspects and sentiment polarities of the same target in targeted aspect-based sentiment analysis (TABSA). However, existing methods do not specifically pre-train reasonable embeddings for targets and aspects in TABSA. This may result in targets or aspects having the same vector representations in different contexts and losing the context-dependent information. To address this problem, we propose a novel method to refine the embeddings of targets and aspects. Such pivotal embedding refinement utilizes a sparse coefficient vector to adjust the embeddings of target and aspect from the context. Hence the embeddings of targets and aspects can be refined from the highly correlative words instead of using context-independent or randomly initialized vectors. Experiment results on two benchmark datasets show that our approach yields the state-of-the-art performance in TABSA task.
