
EFP Brief No. 212: Tech Mining

Tuesday, May 1st, 2012

The main purpose of the exercise is the development of new methods to discover patterns that new technologies follow and the opportunities they offer for innovation. This brief attempts to foster a new understanding of the mechanisms generating innovations. It presents a methodology to identify future technology opportunities based on text mining of scientific and technological databases. Assisting priority or agenda setting, the method could be useful for technology managers and corporate decision-makers in planning and allocating R&D resources.

New Methods to Anticipate Opportunities around Technologies

The analysis of new technologies has been of interest for many years. The increase in disruptive innovations and scientific research in recent years is driving institutions and companies alike to develop methodologies for identifying the technologies of the future. However, methods are still needed that can discover the patterns according to which these technologies are likely to evolve. Such methods will make it possible to convert new technologies into opportunities for innovation, an essential prerequisite for maintaining competitiveness in the long term.

Scientific databases, and patent databases in particular, are generally regarded as precursors of future or ongoing technological developments. Therefore, the analysis of such databases should make it possible to identify technology gaps that could potentially be transformed into opportunities.

Against this background, the project “How to anticipate opportunities around technologies” moves towards understanding the mechanisms generating innovations.

This exercise was designed and launched in light of the need to foster and accelerate scientific and technological innovation. Scientific publications and patent records are analysed as the empirical basis of the study. Experts are then asked to comment on the results of the analysis. The methodology applied to monitor new technologies uses the tech-mining approach and a combination of quantitative analysis and expert knowledge.

We will demonstrate how this instrument makes it possible to anticipate opportunities around technologies, drawing on examples from two different industrial sectors. The methodology has been developed with data from two different technology fields in order to compare and validate results: waste recycling and “non-woven” textiles and their applications.

The project is running from 2010 until the end of 2012. The application to the waste recycling sector is financed through the SAIOTEK programme of the Basque Ministry of Industry, Trade and Tourism.

Quantitative Databases and Qualitative Knowledge

The exercise deals with the identification of opportunities based on scientific articles and patent information, using quantitative methods to process the information and expert knowledge for assessing it. The main goal is to identify the most important factors influencing the development of a new technology and to understand the mechanisms generating innovation.

The project team comprises researchers from the Industrial Engineering and Management Departments of two technical universities, the University of the Basque Country and the Polytechnic University of Valencia (Universidad Politécnica de Valencia), and the R&D centre TECNALIA. The collaborating R&D centre has been granted the right to make first use of this research.

Tapping into the Scientific Knowledge Base

The exercise is divided into two phases. In Phase I, the technologies were defined in order to analyse the scientific knowledge in the respective technology field and to outline the technology landscape using the knowledge contained in article and patent databases. We applied the tech-mining approach in the first step, then built a cross-correlation matrix and finally performed principal component analyses (PCA). This resulted in visualisations of the technology sectors in which gaps around technologies can be identified. Figure 1 shows the characteristics of the scientific information analysed for the waste recycling sector.
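The cross-correlation and PCA steps of Phase I can be illustrated with a minimal sketch. It assumes a small document-by-term count matrix; the terms, the counts and the function names are hypothetical and are not taken from the project data, and NumPy is simply one convenient tool for the calculation rather than the software the authors used.

```python
import numpy as np

def cross_correlation(doc_term: np.ndarray) -> np.ndarray:
    """Pearson correlation between the term columns of a document-by-term matrix."""
    return np.corrcoef(doc_term, rowvar=False)

def principal_components(corr: np.ndarray, n_components: int = 2):
    """Return the largest eigenvalues of a correlation matrix and their loadings."""
    eigvals, eigvecs = np.linalg.eigh(corr)           # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest ones
    return eigvals[order], eigvecs[:, order]

# Hypothetical toy data: 6 records (rows) by 4 terms (columns).
terms = ["compost", "enzyme", "shredder", "sorting"]
doc_term = np.array([
    [3, 2, 0, 0],
    [2, 3, 0, 1],
    [4, 3, 1, 0],
    [0, 0, 3, 2],
    [0, 1, 2, 3],
    [1, 0, 4, 3],
], dtype=float)

corr = cross_correlation(doc_term)
eigvals, loadings = principal_components(corr)
# Share of total variance carried by each retained component.
explained = eigvals / corr.shape[0]
```

Plotting the terms by their two loadings gives a simple two-dimensional map in which terms that co-occur lie close together; sparsely populated regions of such a map are candidates for the gaps discussed in this brief.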

Assessing Emerging Technologies

In Phase II, we will use qualitative techniques to assess the potential of the emerging technology gaps found. These interim results will be discussed with experts (“bottom up”) to identify potential opportunities. The R&D centre will contribute upon request and will play a key role, particularly in identifying opportunities in the last phase. Previous work in this field was considered as well (see references).

The Tech-mining Methodology

The foresight method developed in this analysis is innovative because it combines qualitative knowledge and quantitative data, allowing the conclusions from the individual analyses to converge into a variety of industrial scenarios. Figure 2 shows an outline of the methodology. The information on the two sectors is retrieved and downloaded from the Derwent Patents and Environmental Abstracts databases. The downloaded information is then analysed using text mining techniques.

In recent years, text mining has been an expanding area. Natural language techniques that use semantic algorithms, combined with advanced statistical techniques such as multivariate analysis or cluster analysis, have become powerful tools for discovering and visualising the knowledge contained in scientific literature.

Identifying Innovative Investment Opportunities

Phase I of the project has been completed; the major socio-economic trends have been identified, and the results have been disseminated to the international community in a paper that exemplifies the analysis for the waste recycling sector. At this point in the project, the main findings, for instance on new technologies in waste recycling, can already be used by innovative companies.

One of the analyses determined the year in which each descriptor appeared for the first time (see Figure 3). The results allowed us to assess new terms such as “detritivores” or “allelopathy”, which first appeared in 2009 and belong to the biotechnological field. These terms, which we call weak signals, appear only once or twice.

Biotechnological terms surfaced as we mined titles and abstract terms in the databases for 2010. These trends are also recognisable in the International Patent Classification (IPC) codes for this period.
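The first-appearance analysis can be sketched in a few lines. The records below are invented toy data that merely reuse the two terms quoted above; the function names, the `since` cut-off and the `max_count` threshold of two occurrences are assumptions chosen to mirror the "once or twice" description, not the project's actual settings.

```python
from collections import Counter

def first_year_and_counts(records):
    """Map each descriptor to its first year and to the number of records using it."""
    first_year = {}
    counts = Counter()
    for year, descriptors in records:
        for descriptor in set(descriptors):   # count each record at most once
            counts[descriptor] += 1
            if descriptor not in first_year or year < first_year[descriptor]:
                first_year[descriptor] = year
    return first_year, counts

def weak_signals(first_year, counts, since, max_count=2):
    """Descriptors that first appear in or after `since` and occur at most `max_count` times."""
    return sorted(
        d for d, year in first_year.items()
        if year >= since and counts[d] <= max_count
    )

# Hypothetical records: (publication year, list of descriptors).
records = [
    (2007, ["composting", "landfill"]),
    (2008, ["composting", "incineration"]),
    (2009, ["composting", "detritivores"]),
    (2009, ["landfill", "allelopathy"]),
    (2010, ["composting", "allelopathy"]),
]

first_year, counts = first_year_and_counts(records)
signals = weak_signals(first_year, counts, since=2009)
```

With this toy input, `signals` contains the two recent, rarely used descriptors, while frequent or older terms such as "composting" are filtered out.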

We are working on creating multiple technological maps. For example, several analyses have been carried out on the patent applications downloaded from the Derwent database. Figure 3 shows a result obtained after cross-correlating the individuals (patents) in a two-dimensional space according to the similarity of their International Patent Classification (IPC) codes limited to four digits, that is, according to their technological content. The IPC is used to assign patents to a similar technology group. We then used the maps to identify patent clusters and areas where patents are lacking. The green ellipses drawn in Figure 3 represent the gaps where there are no patents.
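As a rough illustration of the mapping idea, the sketch below truncates IPC symbols to four characters, compares patents by the overlap of their IPC groups, and flags empty regions of a two-dimensional map. Several points are assumptions made only for the example: Jaccard overlap stands in for whatever similarity measure the project used, a coarse grid replaces the hand-drawn ellipses, and the patent identifiers, IPC symbols and map coordinates are hypothetical.

```python
def ipc_group(symbol: str, length: int = 4) -> str:
    """Truncate an IPC symbol such as 'B09B 3/00' to its first four characters."""
    return symbol.replace(" ", "").upper()[:length]

def jaccard(a: set, b: set) -> float:
    """Share of IPC groups that two patents have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def empty_cells(points, bins: int = 2):
    """Lay a bins-by-bins grid over the map and return the cells without patents."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]

    def cell(value, low, high):
        if high == low:
            return 0
        return min(int((value - low) / (high - low) * bins), bins - 1)

    occupied = {
        (cell(x, min(xs), max(xs)), cell(y, min(ys), max(ys))) for x, y in points
    }
    return sorted(
        (i, j) for i in range(bins) for j in range(bins) if (i, j) not in occupied
    )

# Hypothetical patents with illustrative IPC symbols.
patents = {
    "P1": ["B09B 3/00", "C05F 17/00"],
    "P2": ["B09B 5/00"],
    "P3": ["D04H 1/42"],
}
groups = {pid: {ipc_group(s) for s in symbols} for pid, symbols in patents.items()}
similarity_p1_p2 = jaccard(groups["P1"], groups["P2"])
similarity_p1_p3 = jaccard(groups["P1"], groups["P3"])

# Hypothetical 2-D map coordinates, e.g. the output of a projection step.
map_points = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 0.8), (0.0, 1.0)]
gaps = empty_cells(map_points, bins=2)
```

A grid is a crude substitute for visual inspection: the result depends on the chosen number of bins, so any cell it reports is only a candidate gap that still has to be checked against the neighbouring patents.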

In a further step, we screened and investigated the patents adjacent to each gap to determine the meaning of the patent gaps. The objective was to analyse the emergence of each gap and evaluate certain indicators that we expected to tell us whether the gap represents a technologically valuable area or not.

Qualitative indicators were defined to characterise each gap: the density of the gap reflects the average number of claim items of the adjacent patents and the half-life of the patents in the vicinity of the gap. The patents on the gap borders were also evaluated in terms of how they relate to the most up-to-date keywords.

In order to establish a methodology for analysing emerging technologies, we determined the year in which each keyword, i.e. descriptor, appeared for the first time, as mentioned above. These keywords can be classified into two types: keywords of emerging frequency and keywords of declining frequency. We can then compare the number of keywords per year across the different gaps. In essence, this procedure allowed us to measure emerging technologies through the keywords found in the patents surrounding each gap.
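One simple way to separate emerging from declining keywords is the sign of a least-squares trend line through the yearly frequencies. This is an assumed criterion used only for illustration, since the brief does not state the rule applied in the project; the keyword counts, the gap labels and the extra "stable" category for a flat trend are likewise hypothetical.

```python
def trend_slope(counts_by_year: dict) -> float:
    """Least-squares slope of keyword frequency against year."""
    years = sorted(counts_by_year)
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts_by_year[y] for y in years) / n
    sxx = sum((y - mean_x) ** 2 for y in years)
    if sxx == 0:                      # a single year gives no trend
        return 0.0
    sxy = sum((y - mean_x) * (counts_by_year[y] - mean_y) for y in years)
    return sxy / sxx

def classify(counts_by_year: dict) -> str:
    """Label a keyword by the direction of its frequency trend."""
    slope = trend_slope(counts_by_year)
    if slope > 0:
        return "emerging"
    if slope < 0:
        return "declining"
    return "stable"

# Hypothetical yearly frequencies per keyword.
frequencies = {
    "nanofibre": {2007: 1, 2008: 2, 2009: 4, 2010: 6},
    "landfill": {2007: 5, 2008: 4, 2009: 2, 2010: 1},
    "enzyme": {2007: 1, 2008: 1, 2009: 3, 2010: 4},
}
labels = {kw: classify(series) for kw, series in frequencies.items()}

# Hypothetical keywords found in the patents bordering each gap.
gap_keywords = {"gap A": ["nanofibre", "enzyme"], "gap B": ["landfill", "enzyme"]}
emerging_per_gap = {
    gap: sum(labels[kw] == "emerging" for kw in kws)
    for gap, kws in gap_keywords.items()
}
```

Comparing `emerging_per_gap` across gaps gives a first indication of which gaps are bordered by patents using growing vocabulary.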

In the field of non-wovens, the tech-mining methodology allowed us to identify several emerging technology trends, among others the increasing use of nanotechnologies in the patented inventions.

During Phase II, we will validate the methodology. Further progress requires the participation of experts in the fields of waste recycling and non-woven textiles who can assess the articles in terms of newly found references. The experts' opinions on the potential impact of newly identified technologies will allow us to determine the most innovative areas of work.

Bio- and Nanotechnology Innovations for Waste Recycling and Non-woven Sectors

The main contribution of this study to research policy is that it provides a methodology to identify new and emerging technologies leading to innovations. An institutional policy encouraging the tendencies identified should be able to increase regional competitiveness.

Our analyses support decision-making through understanding how innovations are generated, enabling decision-makers to anticipate and address the challenges identified and the emerging weak signals. Furthermore, once the project is completed, we will have applied our method to two practical cases from the waste recycling and non-woven sectors. With these examples, we want to demonstrate how the methodology suggested can be applied to anticipate opportunities.

The method could be particularly useful for technology managers and corporate decision-makers in order to plan and allocate R&D resources. Governments and regional development agencies could also use it to improve innovation policies in terms of planning and decision-making.

However, in many cases, new technologies are a necessary but not a sufficient condition for successful innovations. A wide range of non-technical factors are relevant as well (demand, regulations etc.). For successful implementation, it will be necessary to identify the innovation pathways.

We believe that in a context of increasing uncertainty and financial constraints, these results show that foresight methodologies such as tech-mining offer a positive return on investment for policy and decision-makers.

Authors: Rosa Mª Rio-Belver (1), Ernesto Cilleruelo (2), Fernando Palop (3)
Sponsors: Departamento de Industria, Innovación, comercio y turismo – Basque Government – Programa SAIOTEK
Type: Sectoral forward looking analysis
Organizer: (1) University of the Basque Country UPV/EHU, C/ Nieves Cano 12, SP-01006 Vitoria-Gasteiz, Spain
(2) University of the Basque Country UPV/EHU, Almed. Urquijo s/n, SP-48030 Bilbao, Spain
(3) Universidad Politécnica de Valencia, Camino de Vera s/n, SP-46022 Valencia, Spain
Duration: 2010-2011
Budget: 45,000 €
Time Horizon: 2012
Date of Brief: March 2011



Sources and References

Cozzens, S.; Gatchair, S.; Kang, J.; Kim, K.; Lee, H.J.; Ordoñez, G.; Porter, A. (2010): Emerging Technologies: Quantitative Identification and Measurement. Technology Analysis & Strategic Management 22 (3): 361–376.

Belver, R.; Carrasco, E. (2007): Tools for Strategic Business Decisions: Technology Maps. The 4th International Scientific Conference “Business and Management”. Vilnius, Lithuania, 5–6 October. Selected Papers. Vilnius Gediminas Technical University Publishing House “Technika”, 2007, 299–303.

Huang, L.; Porter, A.; Guo, Y. (2009): Exploring a Systematic Technology Forecasting Approach for New & Emerging Sciences & Technologies: A Case Study of Nano-enhanced Biosensors, in Proceedings of the Atlanta Conference on Science and Innovation Policy. Georgia Tech University, Atlanta, USA, 2–3 October.

Lee, S.; Yoon, B.; Park, Y. (2009): An Approach to Discovering New Technology Opportunities: Keyword-based Patent Map Approach. Technovation 29: 481–497. doi:10.1016/j.technovation.2008.10.006

Porter, A.; Newman, N. (2011): Mining External R&D. Technovation 31 (4): 171–176. doi:10.1016/j.technovation.2011.01.001

Porter, A.; Kongthon, A.; Chyi, L. (2002): Research Profiling: Improving the Literature Review. Scientometrics 53 (3): 351–370. doi:10.1023/A:1014873029258

Rio, R.; Cilleruelo, E. (2010): Discovering Technologies Using Techmining: The Case of Waste Recycling. The 6th International Scientific Conference “Business and Management 2010”. Vilnius, Lithuania, 13–14 May. Selected Papers. Vilnius Gediminas Technical University Publishing House “Technika”, Vilnius, 2010, 950–955. doi:10.3846/bm.2010.127

Rio, R.; Larrañaga, J.; Elizagarate, F. (2008): Patentalava. Dynamics of Innovation Strategies and their Relationship with the Evolution of Patents. The Alava province case, in The 5th International Scientific Conference “Business and Management”. Vilnius, Lithuania, 5–6 October. Selected papers. Vilnius: Technika, 475–480.

Yun, Y.; Akers, L.; Klose, T.; Barcelon, C. (2008): Text Mining and Visualization Tools – Impressions of Emerging Capabilities, World Patent Information 30: 280–293. doi:10.1016/j.wpi.2008.01.007

Zhu, D.; Porter, A. L. (2002): Automated Extraction and Visualization of Information for Technological Intelligence and Forecasting, Technological Forecasting and Social Change 69: 495–506. doi:10.1016/S0040-1625(01)00157-3