• Public Folders – Folders Assistant Not Forwarding Messages

    A recent issue occurred where users were trying to configure a Public Folder's Folder Assistant to forward inbound posts/messages to another Public Folder.  The users were able to configure the rule; however, the message was never forwarded.  This is a short synopsis of how we resolved the problem.

    PREREQUISITES

    • Public Folders – both were mail enabled and visible in the GAL
    • Exchange 2010 SP2
    • Outlook 2010 SP1

     

    CONFIGURATION
    Within Outlook, open the Public Folder properties and select Folder Assistant (General Tab).
    [screenshot]

    Add a rule:
    [screenshot]

    Set that rule to forward to another Public Folder:
    [screenshot]

    NOTE: All other settings were optional, and did not impact this issue.

    SYMPTOMS
    • Get-PublicFolder shows that this folder does have HasRules set to True
    • Changing the delivery to a mailbox, rather than a Public Folder, works just fine
    • Event Viewer’s Application Log contained the following event:

    Source: MSExchangeIS Public Store
    Event ID: 2028
    Task Category: Transport Delivering
    Level: Error
    Description: The delivery of a message sent by public folder 0000013456F2 has failed.
    To: MyPublicFolder
    The non-delivery report has been deleted.

    RESOLUTION
    As you can see, the above event is related to Transport Delivering, from the MSExchangeIS Public Store.  Since delivery worked to a mailbox, but not to the second Public Folder, this led me to suspect permissions. 
    After modifying the Anonymous client permissions to include Create Items, the Folder Assistant worked just fine.  By default, Anonymous only had Folder Visible, not Create Items.
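    For reference, the same change can be made from the Exchange Management Shell (the folder path below is a placeholder for your own):

    ```powershell
    # Review the current client permissions on the target folder
    Get-PublicFolderClientPermission -Identity "\MyPublicFolder" |
        Format-Table User,AccessRights

    # Grant Anonymous the Create Items right so the Folder Assistant
    # can deliver forwarded messages into this folder
    Add-PublicFolderClientPermission -Identity "\MyPublicFolder" -User Anonymous -AccessRights CreateItems
    ```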


    I don’t know how many people will ever run into this, but I figured this post may help reduce troubleshooting time for others.
    Good Luck!
    Doug

  • A Few Recommendations for Exchange 2010


    The following is a partial list of items that I recommend be reviewed for all Exchange 2010 server deployments.  The focus is to ensure that the environment is consistently configured, reliable, and performing optimally.  This is not an official checklist, just something that I've been using for a while.

    Server Build
    - Confirm that hardware has been updated to the latest driver and firmware builds
    - Verify that the latest software builds have been installed, including Exchange, antivirus, monitoring agents, filter packs, etc.
    - Verify that the operating system is running the latest build and has the recommended OS hotfixes


    Server Network Interfaces
    - Know if your environment explicitly denies IPv6 network traffic.  If so, then you may need to disable IPv6 on the NICs
    - NIC teaming is great for the MAPI/Public adapters - but should be configured to use Fault Tolerance (not automatic or load balance)
    - Network settings should be consistent on ALL servers, including driver, TCP/IP settings (i.e. DNS), and binding order
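    To spot drift in DNS settings across servers, a quick WMI comparison works (the server names below are placeholders):

    ```powershell
    # Compare DNS server ordering across a set of Exchange servers
    $servers = "EX01","EX02","EX03"
    foreach ($s in $servers) {
        Get-WmiObject Win32_NetworkAdapterConfiguration -ComputerName $s -Filter "IPEnabled = True" |
            Select-Object @{n="Server";e={$s}}, Description,
                          @{n="DNS";e={$_.DNSServerSearchOrder -join ","}}
    }
    ```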


    System Settings
    - Server's Page File should be moved off of the system partition
    - Server System Failure should be using Kernel Memory Dump
    - Proper file-level antivirus exclusions should be configured, including the file share witness, monitoring agents, cluster, IIS, and Exchange


    Active Directory
    - Verify that Active Directory has been properly configured (i.e. AD site links, no RODCs; 64-bit GC/DCs running Windows Server 2008 R2 are preferred, etc.)
    - AD Replication time should be optimally configured, documented, and confirmed that there are no replication errors occurring
    - All domain controllers are responsive (i.e. none are offline) and pass DCDIAG and other AD related tests
    - Subnets should be properly defined within the AD Site design
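    These AD checks map to the standard diagnostic tools; for example, from any domain-joined server (the DC name is a placeholder):

    ```powershell
    # Summarize replication status and failures across all domain controllers
    repadmin /replsummary

    # Run the full DC diagnostic suite against a specific domain controller
    dcdiag /s:DC01 /v
    ```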


    Other Dependencies
    - Confirm that the hardware (server, storage, network, etc.) is working properly without any errors or warnings being generated
    - Network performance and reliability should be evaluated.  If network is slow or unreliable, users will feel that pain!
    - DNS should be reviewed for proper records and replication/configuration.  Remove any old records that may impact messaging.


    Client Access
    - All AD sites are defined within your AutoDiscoverSiteScope, including client-only sites
    - Enable Kerberos for the CAS Array
    - Enable logging on IIS and the CAS and track which clients are accessing your environment
    - Have recommended minimum client builds for your environment and know how to parse the logs to determine builds
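    For the AutoDiscoverSiteScope item, the site list can be reviewed and set per CAS from the Exchange Management Shell (server and site names below are placeholders):

    ```powershell
    # Review the current scope on each Client Access server
    Get-ClientAccessServer | Format-List Name,AutoDiscoverSiteScope

    # Include client-only AD sites so clients in those sites resolve to this CAS
    Set-ClientAccessServer -Identity "CAS01" -AutoDiscoverSiteScope "Datacenter-Site","Branch-Site"
    ```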


    Transport
    - Confirm that EWS and OWA are properly configured to allow for your organization's message size limits
    - Verify that message limits are consistently configured (server, global, connectors, etc.)
    - Routing components should be evaluated, and any unnecessary transport settings removed (ex: Accepted Domains, Connectors, etc.)
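    A quick way to compare message size limits at each level (connector names and output will vary per environment):

    ```powershell
    # Organization-wide limits
    Get-TransportConfig | Format-List MaxSendSize,MaxReceiveSize

    # Per-connector limits - these should line up with the global values
    Get-SendConnector    | Format-Table Name,MaxMessageSize
    Get-ReceiveConnector | Format-Table Server,Name,MaxMessageSize
    ```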


    Public Folders
    - If using dedicated PF servers, PF should be configured to replicate to all of those servers (min of 2 copies)
    - Does your Exchange-aware antivirus software scan Public Folder replication messages? Should it?
    - To improve Public Folder access performance, remove deleted security objects from the client permissions
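    Deleted security objects typically show up in the client permissions as raw SIDs; a sketch for finding them (the folder path is a placeholder):

    ```powershell
    # List permission entries whose account no longer resolves (left as a SID)
    Get-PublicFolderClientPermission -Identity "\MyPublicFolder" |
        Where-Object { $_.User -like "NT User:S-1-5-*" } |
        Format-Table User,AccessRights
    ```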


    Security
    - Should Administrator Audit Logging be enabled?
    - Windows Firewall should be enabled and properly configured to work with all applications installed on the server
    - Rarely should you modify the default RBAC groups.  Rather, create new groups and manage the permissions through that model
     

    Some other things...
    - Go through the Exchange Best Practices Analyzer health check
    - Be sure to follow the Mailbox Storage Calculator - either provided by MSFT or by your storage vendor
    - Determine your requirements for custom Client Throttling Policies (ex: service accounts)
    - Have you set the External Post Master Address?
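    The external postmaster address can be checked and set in one place (the address below is a placeholder):

    ```powershell
    # Check whether an external postmaster address has been configured
    Get-TransportConfig | Format-List ExternalPostmasterAddress

    # Set it for the organization
    Set-TransportConfig -ExternalPostmasterAddress "postmaster@contoso.com"
    ```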


    Hope this helps!
    Doug

  • Exchange 2010 DAG - NetworkManager has not yet been initialized

     

    Recently, on two separate occasions, I had to assist in resolving an issue where a member of an Exchange 2010 database availability group (DAG) failed to participate in the DAG's cluster communications and therefore was unable to bring any databases online.  In both instances, this occurred after the server was rebooted.  While each issue had a slightly different resolution, I am fairly confident that they are related.  And since it took a while to isolate and resolve these issues, I thought I would share the experience.

    Before I begin, in neither scenario did we lose quorum of the DAG.  Also, the symptoms of both scenarios were nearly identical. 

     

    SYMPTOMS

    • Viewing these servers from Failover Cluster Manager shows them with a STATUS of DOWN.
    • Network connections for these members are listed as UNAVAILABLE
    • The Cluster service starts on these servers; however, the following event is logged in the System event log
      Log Name:      System
      Source:        Microsoft-Windows-FailoverClustering
      Event ID:      1572
      Task Category: Cluster Virtual Adapter
      Level:         Critical
      Description:  Node 'SERVER' failed to join the cluster because it could not send and receive failure detection network messages with other cluster nodes. Please run the Validate a Configuration wizard to ensure network settings. Also verify the Windows Firewall 'Failover Clusters' rules.
    • Attempting to view the Exchange DAG status or networks returns the error:
      A server-side administrative operation has failed. 'GetDagNetworkConfig' failed on the server. Error: The NetworkManager has not yet been initialized. Check the event logs to determine the cause. [Server: SERVER5.Contoso.inc]
          + CategoryInfo          : NotSpecified: (0:Int32) [Get-DatabaseAvailabilityGroup], DagNetworkRpcServerException
          + FullyQualifiedErrorId : A6AA817A,Microsoft.Exchange.Management.SystemConfigurationTasks.GetDatabaseAvailabilityGroup
    • Cluster Log Shows:
      WARN  [API] s_ApiOpenGroupEx: Group Cluster Group failed, status = 70
      DBG   [HM] Connection attempt to SERVER01 failed with error WSAETIMEDOUT(10060): Failed to connect to remote endpoint 1.2.3.45:~3343~.
      INFO  [JPM] Node 7: Selected partition 33910(1 2 3 4 5 6 9 10 11 12 13 14) as a target for join
      WARN  [JPM] Node 7: No connection to node(s) (10 12). Cannot join yet
    • Cluster Validation Report shows:
      Node SERVER01.Contoso.inc is reachable from Node SERVER5.Contoso.inc by only one pair of interfaces. It is possible that this network path is a single point of failure for communication within the cluster. Please verify that this single path is highly available or consider adding additional networks to the cluster.
      The following are all pings attempted from network interfaces on node SERVER5.Contoso.inc to network interfaces on node SERVER05.Contoso.inc.
    • A network trace showed that cluster communication was in fact going through to all other nodes on port 3343 and responses were returned. 
    • There was no change in errors even after disabling Windows Firewall and removing file-level antivirus and security products from the servers.
    • Removing NIC teaming from the servers did not resolve the issue


    RESOLUTION #1
    In this scenario, the issue occurred within our lab running on Hyper-V.  Based on Hyper-V's network summary output, I could see that the servers really were not communicating properly.  Yes, they could ping and they could authenticate with the domain, but cluster communication was failing. 
    The resolution was to consistently configure the network settings on all DAG members and to reset the Hyper-V network properties.  This meant:

    • Confirm that the networks were identically configured across all DAG members (i.e. REPL/MAPI networks, TCP/IP settings, binding order, driver versions, etc.)
    • Disable IPv6 on the servers [NOTE: It is recommended to leave IPv6 enabled, even if you do not have an IPv6-enabled network!  In most scenarios, disabling IPv6 on an Exchange 2010 server should be a last resort.]
    • Edit the Hyper-V network properties page for this VM
    • Once rebooted, all was working fine.


    RESOLUTION #2
    In this scenario, this occurred in production.  Ultimately we decided to change the IP address of the 'broken' DAG member and reboot the server again.  This allowed the server to properly register its network connections with the cluster DB (ClusDB) and all other nodes were able to talk properly.  This allowed the DAG member to rejoin the DAG and then all databases were able to mount and/or replicate their copy successfully. 

    We found that not all of the production DAG members were identically configured with their network settings (i.e. 2 DAG members did not have a REPL network configured).  Per http://technet.microsoft.com/en-us/library/dd638104.aspx#NR, "each DAG member must have the same number of networks".  We fixed the networks and updated the servers to include the recommended hotfixes - http://blogs.technet.com/b/dblanch/archive/2012/02/27/a-few-hotfixes-to-consider.aspx
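    Once the servers are healthy again, the DAG network layout can be compared from the Exchange Management Shell (the DAG name is a placeholder) to confirm that every member exposes the same MAPI and REPL networks:

    ```powershell
    # Each DAG network should list an interface for every member server
    Get-DatabaseAvailabilityGroupNetwork -Identity "DAG01" |
        Format-List Name,Subnets,Interfaces,ReplicationEnabled
    ```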


     

    Questions/Answers
    Why did changing the IP address of the DAG member work?   We're not exactly sure, but we believe that either a stale TCP route or something in the ClusDB was preventing any server with that IP address from joining the cluster.
    Did you reboot all of the DAG member servers before or after changing the IP address?  No, we did not want to risk losing another server within the DAG (we had already lost 2 of the 12 members).  We did, however, reboot all of the servers in the lab scenario.
    Did you ever lose quorum of the DAG? Nope.
    Do you think that you could have prevented this?  Maybe.  If we had applied all of the hotfixes outlined here and confirmed that all network settings were identical on all DAG members, the servers might not have hit this issue.   There may be other causes, but it is always recommended to resolve the known issues first.


    Good Luck.
    Doug