Statistics of the Internet: what is going on in Europe?
Vilnius University, Faculty of Economics, Department of Marketing
Sauletekio al. 9, 2 build, 819 room
Tel. 370 5 2366146
Abstract. The paper analyses both theoretical and empirical aspects of Internet research, focusing on its social (and information) aspects. It starts by presenting the state of the art of social research on the Internet, emphasising that Web statistics is part of it. Silver (2000) defines three stages in the development of social research on the Web. The current (third) stage of Internet research, the so-called critical cyberculture studies, started in the late 1990s and has been the most mature study of the social aspects of the Internet.
The second chapter of the paper explains the system of Internet information, taking the findings of IDC as the basis. There are four main sources of information on Internet usage and commerce: macro-level statistics, demand-side research, supply-side research and access technology data. Chapter three analyses the methodological problems of Internet surveys. Currently the problem is not a lack of data on the Internet but rather its accessibility, dissemination and, particularly, methodology. Distortions in Internet information reflecting the same indicators can arise from differences in the sources of the survey population, in samples, in questionnaires, and in the quality and management of the surveys. Criss-cross examination can be used to establish the reliability of the indicators. Chapter four briefly presents the policies of Internet development in Europe, emphasising the eEurope project. Chapters five and six analyse the empirical data on Internet development in Europe, using Eurostat data. Internet penetration among individuals in the EU15 recently exceeded 50%. However, the differences among EU countries are significant, as are those between gender groups and by level of education. On the other hand, Internet penetration among EU enterprises has reached saturation and is close to 100%. E-commerce has penetrated further among EU individuals than among enterprises: 19% of EU15 individuals bought on the Web in 2003, while only 12% of enterprises did so. The paper ends with a discussion of the drivers of Internet development. It concludes that we must currently consider not only the traditional factors of Internet development but also the power of the Internet's self-development.
1. Introduction. Social research on the Internet: state of the art
In spite of its relatively short history, the Internet has already established itself as one of the most important channels for communication, business and marketing. As soon as it became available to the general public in the 1990s, it became a subject of research1. The first scientific research on the Internet appeared a decade ago, and during that decade various aspects of the Web were investigated. Scientists from various fields (economics, communication, social science, technology, anthropology, ethnography, design, etc.) attempted to apply knowledge from their own field to Internet research. Considering the social aspects of Internet research, which also comprise Internet statistics, Silver (2000) defines three stages of development and discusses them within the framework of research on cyberculture2. The first stage is popular cyberculture research, which came into existence at the beginning of the 1990s. At this stage the Internet attracted little attention from serious science. The authors who contributed to this stage (Birkerts, 1994; Kling, 1996; Sale, 1995) warned society about the threats of the Internet: social fragmentation, escapism, lack of human communication, etc. The second stage of cyberculture studies focused largely on virtual communities and networks, and benefited from an influx of academic scholars. Rheingold (1993) contributed most to this research by publishing The Virtual Community. However, research at this stage was marked by a considerable degree of over-enthusiasm. The third stage of Internet research, the so-called critical cyberculture studies, started in the late 1990s and has been the most mature study of the social aspects of the Internet. At this stage scholars have taken a broader and more rational view of what constitutes cyberculture.
According to Silver (2000), critical cyberculture studies contain four major, interdependent areas of attention: (1) they explore the social, cultural and economic interactions that take place online, (2) they unfold and examine the stories we tell about such interactions, (3) they analyse a range of social, cultural, political and economic factors that encourage and enable individuals or groups of people to have access to the Internet, (4) they assess the deliberate, accidental and alternative technological decision-making and design processes.
Statistics of the Internet explores various aspects of the interaction of Web users with the Web and presents them as methodologically measured quantities. It is therefore worth reviewing briefly what researchers have done and what directions they have taken in exploring this interaction, with emphasis on its social aspects (Graph 1).
Graph 1. Model of Web users’ interaction with the Web
[Figure not reproduced; one recoverable element label: Web page design]
Source: own design, 2003
Societies use the Internet as a multifunctional medium for information, education, communication, entertainment, etc. The Web is different from the traditional media. In both earlier and recent research, interactivity has been recognized as a major distinction, one that requires a certain type of user and an active mode of communication (Richardson, 2001, McMillan, Sally, 2000)3. Interactivity can be defined as an assumed attribute of interpersonal communication and can be understood mainly as a two-way communication system between senders and receivers. Interactivity allows users to choose freely which websites to visit and when to view them, and it implies communicative exchange. However, the Web has further functions: it can empower the user to produce (Burnett, Marshall, 2003). Examples of this empowerment are personal Web pages and other presentations of oneself to the Web audience. The Web enables the user to eliminate the natural divisions among production, distribution and exhibition, as the network makes these divisions meaningless. The development of the production function requires a different notion of media literacy.
Users need a stimulus or motive to make the first click and, after that, a wish or need to make the next one. Research on Internet users' motives has been done within the framework of uses and gratifications theory (Rafaeli, 1989, Eighmey 1997). In the late 1990s Korgaonkar and Lori D. Wolin conducted extensive research on the motivation for Web usage. They used six focus group interviews, followed by a survey. The following Web usage motives were defined, listed in order of the scores they obtained: (1) Social escapism motivation. The Web is seen as gratifying due to its ability to provide diversion and arouse emotions and feelings. This factor also incorporates the motivation to overcome loneliness and find companionship, (2) Information motivation. The Web allows people to acquire useful information quickly, easily and inexpensively, (3) Interactive control motivation. This factor allows users to choose freely which websites to visit, when to view them, etc. (4) Socialization motivation. This factor represents the role of the Web as a facilitator of interpersonal communication, (5) Economic motivation.
The success of interaction depends not only on interactivity, the production function and motivation. It also depends on the design of the Web: its attractiveness, innovation and convenience. Studies of Web design started only recently. Academics researching Web design formed the field of so-called human-computer interaction, or HCI (Kollock, 1996, Baecker 1997, Kim 1999). They approach the interface as the critical site of interaction4.
The outcomes of the process of interaction are numerous and differ in nature. First of all, there is the identified population of users. Measurement of Web users is a recent phenomenon. It was started about four years ago by market research companies (Jupiter Research, Gallup, International Data Corporation, etc.). Apart from market research companies, some governmental organizations have also shown interest in studies of Web users.5
The next very important outcome of the interaction is the change in Web users’ values and attitudes. Interaction with the Web allows them to gain new knowledge and experience. As an outcome of interaction, Web users become connected to communities and networks, find their identity and gain richer experience, communication and membership. Finally, the formation of the Virtual society could also be considered an outcome of interaction (Burnett, Marshall, 2003, Woolgar, 2000).
What does the Web gain from this interaction? The benefit is impressive. First of all, through the interaction the Web gains value, both social and economic. Economic value, expressed in numbers of Web transactions, employment and other benefits, can be measured and identified. The Web economy is often called the attention economy or new economy, where traditional wealth becomes less important than the ability to capture people's attention (Burnett, Marshall, 2003, Goldhaber, 1997).
The social value of the Web has attracted more attention from academics than its economic value. Social value is gained through the social effects of the Web. The majority of studies focus on the issues of virtual communities, societies, networks and identities. Rheingold (1996) was probably the first to write on the existence of online communities, saying “virtual communities are cultural aggregations that emerge when enough people bump into each other often over the cyberspace”. These virtual communities may be based at workplaces, may be concentrated around a dominant person, or may be online groups of people with common interests.
The broad impact of the Internet on the expression and perception of social identities is relatively clear. There have been two major stances on the Internet’s social effects, which reveal two quite distinct interpretations of its effects on one’s identity and integration into the larger social world: one that the Internet is positive, and the other that the Internet is harmful (Baym, Zhang and Lin 2001). However, the studies used different methodologies, asked different questions and drew completely different conclusions. The studies revealing negative impacts of the Web conclude that use of the Internet challenges traditional relationships, diminishes total social involvement, and increases loneliness and depression (Homenet 2000, Kraut 1998, Nie and Erbring 2000). On the other hand, several studies suggest that the Internet may indeed enhance and enrich offline social life (UCLA Centre for Communication Policy 2000, Pew Project on the Internet and American life 2000, Dimmick 2000, Stafford 1999). These studies found that Internet users are more socially active than non-users and that use of the Internet slightly increased the number of people in different interest groups. E-mail in particular is used to support and maintain meaningful relationships. Some studies have argued that Internet use may improve emotional well-being rather than create loneliness and depression (La Rose 2001).
Virtual society is a special issue in research on the social effects of the Internet. The term came into being following the well-known Oxford University project Virtual Society?, led by Steve Woolgar (Woolgar, Wyatt, Thomas, Terranova, etc. 2002). The project attempted to answer the question of the dimensions and presence of the Virtual society. The question remained unanswered at the end of the project. It was concluded that, to understand the impacts of new technologies on societies, research should focus on different aspects: the impact on education, medical help, work, politics, identities, values, etc.
2. Internet information sources in Europe
The need for Internet measurement appeared around 1996-97, when the Internet started to become powerful. Today, Internet data are used for various purposes, and both primary measurements and secondary information sources can be used to explore this area. The variety of information on the Internet is presented in Graph 2.
Demand-side information in the form of end-user surveys is usually the most important in any Internet study. This type of information usually contains data on Internet audiences. The research companies that capture this information use the following primary methods:
Measurement from a sample of users who are metered (electronic measurement),
Measurement from a sample of users who are surveyed (recall measurement),
Measurement from analysis of server log files or other equipment.
Survey from a sample of users. These studies select a sample of Internet users and then query the respondents through standard survey methods: telephone or face-to-face interviews, questionnaires sent by mail, or web-based interviews. The advantage of this approach is that detailed information about the individual users can be obtained, such as age, sex, income, geography and so on. Moreover, other data useful for advertising can be collected, such as attitudes to and awareness of certain technologies, hobbies, lifestyles, etc. This type of survey is widespread and is used by all the major market and Internet research organizations, such as Gallup, the US Department of Commerce, the International Data Corporation, Eurostat, Jupiter Media Metrix, the Online Computer Library Centre, TGI Europa, etc.
Electronic measurement from a sample of users. These large-scale studies recruit a sample of Internet users, install a software meter and passively record usage of the computer and the Internet, and automatically transmit that data back to an office for tabulation and projection. This is the method employed by Media Metrix and it has the advantage that it measures all sites, makes inter-site comparisons easy and allows for reliable rankings. Demographic details are known about each of the users in the sample, thus allowing for audience composition calculations.
However, for several reasons, end-user surveys should not be used exclusively. Firstly, different survey methods yield different results depending on the size and selection of the sample, the way the survey questions are designed, and the underlying definition of Web users. Secondly, a single survey seldom covers a number of countries.
Graph 2. Information sources on Internet usage and commerce
Macro-level statistics (economic, demographic and social variables, enterprise and industry statistics, etc.)
Demand-side research (consumer surveys, business end-user surveys, online questionnaires, etc.)
Supply-side research (web seller interviews and statistics, host count statistics, etc.)
Access technology data (installed bases of access devices, PC, ISP subscriptions, etc.)
Source: IDC, 2001
Supply-side information is especially important with regard to Internet commerce. In some areas, such as business-to-business commerce, which is characterized by a relatively small number of buyers and sellers with relatively high transaction levels, it is easy to identify the major Web sellers and find out how much they sell online. It is therefore unnecessary to ask a random sample of businesses how much they spend online. For this purpose, qualitative interviews can be used.
Macro-level statistics in various forms could be used to validate the data. For example, demographic statistics splitting the population into employees, students etc. and into different age groups could help to validate the number of work and school users in different countries. Economic data, including figures on the distribution of IT spending could help to assess the level of business-to-business commerce at country level. Also, if the study deals with future Internet developments, macro-level indicators are used as the variables for the model of prediction.
Access technology data are of two types, access device data and Internet Service Provider (ISP) data, and can be used in different ways. First of all, they can serve as additional tools to estimate the current level of Internet usage in countries where survey material is limited, as well as to validate the available data in advanced countries. Secondly, access technology data can be used to predict future developments.
Internet measurement serves a few purposes. The first may be called self-promotion: it is important for organizations to be able to make claims about the size and growth of their audiences or technologies. The second purpose, which was the driving force behind the launch of specific Internet measurement, is to support advertising planning, buying and selling. Organizations that offer Internet media opportunities to advertisers or their agencies use audience measurement data to help position and sell the inventory. This is the same role that television ratings, radio ratings and magazine audience figures play in media planning.
In this article we aim to explore briefly the demand-side and supply-side information sources in Europe that apply the survey method based on a sample of users. We also limit ourselves to European-scale research and the major European-scale companies, as well as to domestic research in Lithuania.
In Europe several market research companies, as well as governmental organizations, collect statistical data on Internet usage in Europe and worldwide. Eurostat, the statistical body of the European Commission, has recently advanced significantly in Internet statistics. Research companies today are multinationals with networked offices worldwide, so it is difficult to establish clearly which of them are European. Nevertheless, we can mention TGI Europa, the International Data Corporation, Jupiter Research and TNS Sofres. In Lithuania data on the Internet are also collected by several companies. The Lithuanian Department of Statistics and the research company TNS Gallup seem to be the most reliable, as they use the adapted methodologies of Eurostat and TNS Sofres. TNS Gallup is networked with TNS Sofres, and the Lithuanian Department of Statistics works under the guidance of Eurostat.
3. Methodology of the Internet surveys: where is the problem?
In recent years the problem has not been a lack of data on the Internet; rather, there are issues with the accessibility of data and the variety of methodologies used to collect them. As with any research that uses data from several sources, one has to check carefully whether they share the same, or at least a similar, methodological background. With Internet information this problem is acute. Normally, when a phenomenon is mature, as with traditional media, the institutions involved in data collection synchronize their work. This is not yet the case for Internet data: the variety of data is high and some are even contradictory, which leads to problems of comparability, particularly in cross-cultural research. Worse still, data tend to be released to the public without a precise description of the methodology (sample size and composition, contact method, questionnaire, etc.), and requests to disclose the method are refused.
Let us examine some examples of the methodological problems of Internet information.
Penetration is considered to be among the major variables when researching the Internet as a form of media. It shows how widely the medium has spread within the defined universe. In its applied form, the penetration variable is calculated as the percentage of the universe population that used the medium in a defined period. However, the defined period differs depending on the research methodology: it could be use of the medium at least once in the past 3 months, once in the past month, once in the past 7 days, etc.
Numerous market research companies calculate Internet penetration. In general, two different definitions of the target group are used: the entire targeted population (everyone aged 15-74, 16-74, or simply 15 and over) and households. The first variable relates to how many people use the Internet regularly at work, at home, at school or elsewhere. The second variable relates to how many private households have access to the Internet. In some cases Internet penetration is calculated for companies, as well as for schools, establishments, etc. Thus, apart from the wording of the question on the frequency of Internet usage (once in the past month, 7 days, 5 days, etc.), sample size and margins can also cause Internet penetration data to diverge among research companies. Another factor that has to be taken into account is the small samples used in many surveys (1% or less of the survey population). Such sample sizes can easily introduce distortions into the data. One also has to question the management of data collection, for example the reliability of the interviewers, as it is well known that poor control of surveys, e.g. of interview techniques, can lead to misleading primary data.
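The dependence of the penetration figure on the reference period can be illustrated with a minimal sketch. The respondents, the dates of last use, the survey date and the function name `penetration` are all hypothetical; they serve only to show how one and the same sample yields different penetration values for 7-day, one-month and 3-month windows.

```python
from datetime import date, timedelta

def penetration(last_use_dates, survey_date, window_days):
    """Percentage of respondents who used the medium at least once
    within `window_days` before `survey_date`.
    A value of None means the respondent never used the medium."""
    cutoff = survey_date - timedelta(days=window_days)
    users = sum(1 for d in last_use_dates if d is not None and d >= cutoff)
    return 100.0 * users / len(last_use_dates)

# Hypothetical sample of five respondents (illustration only).
survey_day = date(2003, 6, 30)
last_use = [
    date(2003, 6, 29),  # used within the past 7 days
    date(2003, 6, 10),  # used within the past month
    date(2003, 4, 15),  # used within the past 3 months
    date(2002, 12, 1),  # used more than 3 months ago
    None,               # never used
]

p7 = penetration(last_use, survey_day, 7)    # past 7 days
p30 = penetration(last_use, survey_day, 30)  # past month
p90 = penetration(last_use, survey_day, 90)  # past 3 months
print(p7, p30, p90)
```

In this invented sample the penetration triples between the narrowest and the widest window, which is why figures from sources using different reference periods should not be compared directly.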
For example, the method of sampling and the sample margins are a problem when looking at data on Internet penetration among enterprises in Lithuania, as the data collected by the Lithuanian Department of Statistics differ from those collected by TNS Gallup. TNS Gallup research for the second quarter of 2002 showed that Internet penetration in domestic enterprises had reached 49%, whereas the survey of Information Technologies in Lithuania completed by the Lithuanian Department of Statistics for the same period showed that it had reached 65%. The difference of 16 percentage points is significant. TNS Gallup used a sample of companies based on the population of enterprises found in a business catalogue, while the Lithuanian Department of Statistics used a statistical register of enterprises compiled at the Department. Both institutions are professional and employ sampling specialists. However, the results produced by the Department of Statistics are likely to be more reliable for a number of reasons: the sample was larger, and, because the survey was conducted by a government department, enterprises would feel an onus to provide correct and prompt answers (in addition, the supply of incorrect data could lead to a company being charged).
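Whether a gap of this size could arise from sampling error alone can be checked with a rough calculation. The sketch below is only an illustration: the sample sizes are hypothetical (the actual ones are not given here), simple random sampling is assumed, and the usual normal approximation for a proportion is used.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error of a proportion p
    estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

def gap_margin(p1, n1, p2, n2, z=1.96):
    """Approximate 95% margin of error of the difference
    between two independent proportions."""
    return z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Reported penetration among Lithuanian enterprises, Q2 2002.
p_gallup = 0.49  # TNS Gallup
p_stat = 0.65    # Lithuanian Department of Statistics

# HYPOTHETICAL sample sizes, chosen only for illustration.
n_gallup = 500
n_stat = 2000

moe_gallup = margin_of_error(p_gallup, n_gallup)
moe_stat = margin_of_error(p_stat, n_stat)
gap = p_stat - p_gallup
moe_gap = gap_margin(p_gallup, n_gallup, p_stat, n_stat)

# If the gap exceeds its margin of error, sampling error alone
# is unlikely to explain the difference between the two sources.
sampling_alone_explains_gap = gap <= moe_gap
print(round(moe_gallup, 3), round(moe_stat, 3), round(moe_gap, 3))
print(sampling_alone_explains_gap)
```

Under these assumed sample sizes the gap is several times larger than its margin of error, which would point to differences in the sampling frame (business catalogue versus statistical register) rather than to random sampling error.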
Graph 3. Distribution of Web-buyers in European countries as % of web-users
Source: IDC, 2000; Taylor Nelson Sofres, 2001; TNS Gallup, 2001; Eurobarometer, 2001
Another example of problematic statistics concerns data used to quantify Web commerce. For example, if we compare the data on the share of e-buying among Web users collected by two organisations, the International Data Corporation and Eurobarometer (see Graph 3 above), we can see that the data differ. It was not possible to obtain details of the methodologies used by these organisations, but the nature of the data points to potential sources of error and difference. One is the frequency of Web buying, for which different categories can be used: once within the past week, once within the past 3 months, once within the past 6 months, once in a lifetime, etc. The definition of e-buying itself can also differ. It can be understood as just ordering goods and/or services on the Internet or, more widely, as ordering, paying and delivering.
In this case, the IDC information seems more plausible than the research from Eurobarometer. It hardly seems possible that the share of purchasers could reach 50% in the UK when the USA had reached only 27%. Of course, this distortion could also arise from confusion of terms in the media, that is, whether the figures indicate e-buying among Web users or among the total population. However, in this case the proportions between the figures do not support such an explanation.
The next example comes from the information analysed later in this paper. According to recent Eurostat data, in 2003 Web-commerce penetration in the EU-15 was higher among the population than among enterprises. A methodological explanation can account for this: only enterprises for which e-commerce accounted for more than 1% of their business transactions in the previous year were surveyed on e-commerce.
What should we do if the data differ? Normally, we can apply the method of criss-cross examination of the sources. Information is considered reliable if the data from different sources on the same phenomenon match.
When data differ, we must logically check the sources and their reliability (methodology, management practices, image, competences of human resources, etc.) and choose the data from the one(s) that seem most reliable. If the methodologies are the problem, the source willing to disclose its methods should be given priority.
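The criss-cross examination described above can be sketched as a simple procedure. This is a minimal illustration under stated assumptions, not a formal method: the function name `criss_cross`, the tolerance of 5 percentage points, the `method_disclosed` flags and the `reliability` scores are all hypothetical, and the scores merely encode the analyst's own judgement of each source.

```python
def criss_cross(estimates, tolerance_pts):
    """Compare estimates of the same indicator from several sources.

    If all values lie within `tolerance_pts` of each other, the
    indicator is treated as reliable. Otherwise sources that disclose
    their methodology are given priority, and among them the one with
    the highest (analyst-assigned) reliability score is preferred."""
    values = [e["value"] for e in estimates]
    spread = max(values) - min(values)
    if spread <= tolerance_pts:
        return {"consistent": True, "spread": spread, "preferred": None}
    disclosed = [e for e in estimates if e["method_disclosed"]]
    candidates = disclosed or estimates  # fall back to all sources
    best = max(candidates, key=lambda e: e["reliability"])
    return {"consistent": False, "spread": spread, "preferred": best["source"]}

# Enterprise Internet penetration in Lithuania, Q2 2002 (% of enterprises).
# The flags and scores below are HYPOTHETICAL and for illustration only.
estimates = [
    {"source": "TNS Gallup", "value": 49,
     "method_disclosed": True, "reliability": 2},
    {"source": "Lithuanian Department of Statistics", "value": 65,
     "method_disclosed": True, "reliability": 3},
]

result = criss_cross(estimates, tolerance_pts=5)
print(result)
```

The scoring step is deliberately left to the analyst, since the reliability criteria listed above (methodology, management practices, image, competences) are qualitative.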
4. Internet policies in Europe
At the very beginning of its introduction, in the early 1990s, the Internet met with resistance. Soon afterwards, however, the use of the Web for various purposes gained support from governments and from the public. States are now implementing their national Internet programs, which often fall under the titles of Information society or Information technologies programs. Regional programs covering several states have also been developed, mainly in Europe.
Today, the countries of the European Union are covered by the eEurope program. The program was released at the end of May 2002 by the European Commission in a Communication entitled “eEurope 2005: an information society for all”. The eEurope 2005 plan aims to stimulate secure services, applications and content based on a widely available broadband infrastructure. The action plan is based on two main groups of actions. Firstly, it aims to stimulate Internet services, applications and content (both on-line public services and e-commerce). This “content initiative” should lead to an increase in the flow and use of information. It is supported by a complementary second action that focuses on improving the underlying communications infrastructure, namely the proportion of broadband and the development of tools and awareness in relation to security matters. To achieve these goals the plan outlines four main tools:
Policy measures: to review and adapt legislation; to ensure that legislation does not hamper new services; to improve access to a variety of networks, etc. Some key targets include a) connect public administrations, schools and health care to broadband, b) create interactive public services, accessible to all and offered on multiple platforms, c) provide on-line health services, d) remove obstacles to the deployment of broadband networks, review legislation affecting e-business;
Exchange of experience, of good practices and demonstration projects;
Monitoring and benchmarking progress;
Co-ordination of existing policies: to bring out synergies between proposed actions, to provide a better overview of policy deployments and ensure a good information exchange between national and European policy makers and the private sector.
In Lithuania, development of the Internet is supported by the Governmental program on the development of the information society, as well as by a number of private initiatives. To stimulate the penetration of computers and the Internet, the tax law was modified. Under the recent change, the income tax paid by individuals (around 30%) is refunded by the Tax Inspectorate on the amount spent on purchasing a computer and/or installing Internet access.
5. Tendencies of the Internet development in countries of the European Union
To outline the tendencies of Internet development in the countries of the European Union (EU) we use information from Eurostat. This statistical institution of the EU has a well-established methodological basis, organizational skills and modern management practices, well-prepared professionals and sufficient finances; it does not depend on commercial whims, and it discloses the methodology of its research.
Eurostat collects information on household Internet usage and on enterprise Internet usage. Both surveys are performed within the framework of eEurope, under the programs on information technologies and the information society.
In 2002, 97 192 households and 153 000 individuals were surveyed in the Member States. In 2003, 60 000 households and 88 000 individuals were surveyed. The individuals covered were aged 16-74. Among enterprises, 61 055 were surveyed in 2002 and 66 162 in 2003. The sample covered enterprises in all sectors with at least 10 employees. Both surveys also included questions on e-commerce penetration, among individuals and among enterprises.
In summary, these surveys showed that:
Nordic countries have greater usage of Information technologies at individual and enterprise level;
Purchasing through the Internet is generally more popular than selling, both for individuals and enterprises;
There is a large discrepancy between Member States in the penetration and use of newer technology such as broadband and the use of the Internet to interact with public authorities.
Internet usage among the populations of the EU is diverse (Table 1). Unfortunately, the indicators of Internet usage are higher among males in all EU countries. Non-EU Iceland shows the smallest gender gap. However, the gender gap in Internet penetration in European countries is not large and amounts to less than 10%. The largest gap is observed in Austria and in non-EU Norway.